Self-hosted Azure DevOps agent disconnects after a few minutes of inactivity with the error "Socket Error: ConnectionReset"


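I am running Azure DevOps pipelines on GCP VMs. I have successfully attached one of the GCP VMs to Azure DevOps as a self-hosted agent in order to provision resources in GCP.

All pipelines run as expected, and the resources are provisioned in GCP.

However, after a few minutes of idle time, the self-hosted agent (the GCP VM) shows as offline.

To bring the agent back online, I have to log in to the GCP VM and run

./run.sh &
so that the agent starts running again.

In the Azure agent logs, the error I see is

Socket Error: ConnectionReset
The full log is below for reference. Note that some values have been redacted.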

[2024-11-25 14:16:52Z INFO JobDispatcher] Worker finished for job 555555-55555-55555-5555-555555. Code: 102
[2024-11-25 14:16:52Z INFO JobDispatcher] finish job request for job 555555-55555-55555-5555-555555 with result: Failed
[2024-11-25 14:16:52Z INFO Terminal] WRITE LINE: 2024-11-25 14:16:52Z: Job datadog_agent_install completed with result: Failed
[2024-11-25 14:16:52Z INFO JobDispatcher] Stop renew job request for job 555555-55555-55555-5555-555555.
[2024-11-25 14:16:52Z INFO JobDispatcher] job renew has been canceled, stop renew job request 99020.
[2024-11-25 14:16:52Z INFO JobNotification] Entering JobCompleted Notification
[2024-11-25 14:16:52Z INFO JobNotification] Entering EndMonitor
[2024-11-25 14:17:15Z INFO MessageListener] Sleeping for 10.247 seconds before retrying.
[2024-11-25 14:33:54Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 5214, session '11111111-bbbb-cccc-dddd-eeeeeeeeeeee'.
[2024-11-25 14:34:20Z INFO MessageListener] Sleeping for 5.217 seconds before retrying.
[2024-11-25 14:35:14Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 5214, session '11111111-bbbb-cccc-dddd-eeeeeeeeeeee'.
[2024-11-25 14:35:15Z INFO MessageListener] Sleeping for 7.191 seconds before retrying.
[2024-11-25 14:35:44Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 5214, session '11111111-bbbb-cccc-dddd-eeeeeeeeeeee'.
[2024-11-25 14:36:13Z INFO MessageListener] Sleeping for 9.866 seconds before retrying.
[2024-11-25 14:37:04Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 5214, session '11111111-bbbb-cccc-dddd-eeeeeeeeeeee'.
[2024-11-25 14:37:13Z INFO MessageListener] Sleeping for 14.743 seconds before retrying.
[2024-11-25 14:38:18Z INFO MessageListener] Sleeping for 13.426 seconds before retrying.
[2024-11-25 14:38:25Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 5214, session '11111111-bbbb-cccc-dddd-eeeeeeeeeeee'.
[2024-11-25 14:38:31Z WARN VisualStudioServices] Attempt 1 of GET request to https://<dns-url>/<organization>/_apis/distributedtask/pools/275/messages failed (Socket Error: ConnectionReset). The operation will be retried in 11.0614112 seconds.
[2024-11-25 14:39:32Z INFO MessageListener] Sleeping for 13.085 seconds before retrying.
[2024-11-25 14:39:45Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 5214, session '11111111-bbbb-cccc-dddd-eeeeeeeeeeee'.
[2024-11-25 14:40:35Z INFO MessageListener] Sleeping for 11.614 seconds before retrying.
[2024-11-25 14:41:05Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 5214, session '11111111-bbbb-cccc-dddd-eeeeeeeeeeee'.
[2024-11-25 14:41:37Z INFO MessageListener] Sleeping for 6.738 seconds before retrying.
[2024-11-25 14:41:44Z INFO MessageListener] Sleeping for 13.28 seconds before retrying.
[2024-11-25 14:42:25Z INFO MessageListener] Sent GetAgentMessage to keep alive agent 5214, session '11111111-bbbb-cccc-dddd-eeeeeeeeeeee'.
[2024-11-25 14:42:48Z INFO MessageListener] Sleeping for 10.344 seconds before retrying.
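
When reading a log like the one above, it can help to count how often the reset occurs and to find the last keep-alive the agent sent. The following is a minimal sketch, not part of the original question. It assumes the agent writes its listener logs as Agent_*.log files in the _diag directory under the agent root; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: summarize connection resets recorded in an agent log file.
# The function names are illustrative; they are not part of the agent.

count_connection_resets() {
  local log_file="$1"
  # grep -c prints the number of matching lines. It exits non-zero when
  # there are no matches, so "|| true" keeps this usable under `set -e`.
  grep -c 'Socket Error: ConnectionReset' "$log_file" || true
}

last_keepalive() {
  local log_file="$1"
  # Take the most recent keep-alive line and cut out its timestamp,
  # which occupies characters 2-21 of the line ("[2024-11-25 14:16:52Z ...").
  grep 'Sent GetAgentMessage to keep alive' "$log_file" | tail -n 1 | cut -c2-21
}

# Example (the path is an assumption, adjust it to the agent directory):
#   for f in "$HOME"/myagent/_diag/Agent_*.log; do
#     echo "$f: $(count_connection_resets "$f") resets, last keep-alive $(last_keepalive "$f")"
#   done
```

If the last keep-alive timestamp is well before the time the agent went offline, the listener process had probably stopped rather than merely lost one connection.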

The command I used to register the self-hosted agent (the GCP VM):

<<PATH>>/config.sh --replace --work work --acceptTeeEula --url https://<domain>/<org>/ --auth pat --token <<PAT-TOKEN>> --agent <my-agent> --pool <my-agent-pool> && <<PATH>>/run.sh

Then:

<<PATH>>/run.sh &

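I don't see a problem with the registration itself, since jobs run successfully. The only issue is that the agent goes offline after being idle for a few minutes (roughly 15-25 minutes).

To rule out a problem with the agent itself, I also used a Docker container image and successfully added that container as a self-hosted agent. In that case, too, all pipelines ran successfully, but the agent went offline again after a few minutes (roughly 15-25 minutes) of inactivity. I also searched Google for other solutions but could not find an answer.

GCP VM instance details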

NAME="CentOS Stream"
VERSION="9"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="9"
PLATFORM_ID="platform:el9"
PRETTY_NAME="CentOS Stream 9"
ANSI_COLOR="0;31"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:centos:centos:9"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://issues.redhat.com/"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 9"
REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"

I would appreciate help resolving this so that the self-hosted agent stays online without disconnecting.

azure-devops tfs azure-pipelines google-compute-engine azure-devops-self-hosted-agent
1 Answer

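The solution below worked for me. The agent needs to run as a service so that it starts automatically with the VM and keeps running in the background:

A) Linux VM:

  1. Navigate to the agent directory on the GCP VM.
  2. Configure the agent to run as a service:
sudo ./svc.sh install
sudo ./svc.sh start
  3. Verify that the service is running: sudo systemctl status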
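
The Linux steps above can be wrapped in a small script. This is only a sketch under a few assumptions: the function name and the example path are hypothetical, the SUDO variable is an illustrative convenience for environments that already run as root, and `./svc.sh status` is used for the final check because it reports on the agent's own service unit.

```shell
#!/usr/bin/env bash
# Sketch: install and start the agent as a service, then show its status.
# install_agent_service and the SUDO override are illustrative names,
# not something the agent itself provides.

install_agent_service() {
  local agent_dir="$1"
  # Use sudo by default; set SUDO to an empty string when already root.
  local sudo_cmd="${SUDO-sudo}"

  if [ ! -x "$agent_dir/svc.sh" ]; then
    echo "svc.sh not found in $agent_dir (has config.sh been run there?)" >&2
    return 1
  fi

  # Run from the agent directory, as in step 1 above. The subshell keeps
  # the caller's working directory unchanged.
  (
    cd "$agent_dir" || exit 1
    $sudo_cmd ./svc.sh install &&
      $sudo_cmd ./svc.sh start &&
      $sudo_cmd ./svc.sh status
  )
}

# Example (the path is an assumption):
#   install_agent_service "$HOME/myagent"
```

Stop any agent started manually with `./run.sh &` before installing the service, so that two listeners do not compete for the same agent registration.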

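B) Windows VM:

  1. Open Command Prompt or PowerShell as an administrator.
  2. Navigate to the agent directory.
  3. Install the agent as a service by running the following commands in the agent directory:
.\svc.cmd install
.\svc.cmd start
  4. Open the Services app on the Windows machine (search for services.msc).
  5. Look for a service named something like vsts.agent....
  6. Make sure the service status is Running.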