Azure DevOps self-hosted agent error: connectivity issues
We use Azure DevOps self-hosted agents to build and release our application. We often see the error below, after which the agent recovers automatically. Does anyone know what this error is, how to tackle it, and where exactly to check the logs for it?
We stopped hearing from agent <agent name>. Verify the agent machine is running and has a healthy network connection. Anything that terminates an agent process, starves it for CPU, or blocks its network access can cause this error. For more information, see: https://go.microsoft.com/fwlink?Linkid=846610
This seems to be a known issue with both self-hosted and Microsoft-hosted agents that many people have been reporting.
Quoting the reply from @zachariahcox of the Azure Pipelines Product Group:
To provide some context, the azure pipelines agent is composed of two processes: agent.listener and agent.worker (one of these per step in the job). The listener is responsible for reporting that workers are still making progress. If the agent.listener is unable to communicate with the server for 10 minutes (we attempt to communicate every minute), we assume something has Gone Wrong and abandon the job.
So, if you're running a private machine, anything that can interfere with the listener's ability to communicate with our server is going to be a problem.
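To see whether the agent machine really loses contact for long stretches, a small probe run from that machine can help. The following is only a rough sketch, not how agent.listener itself works: the function names are hypothetical, and a bare TCP connection is a weaker check than the listener's real traffic (it does not exercise proxies, TLS, or authentication).

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return False


def longest_outage_minutes(results: list[bool], interval_minutes: int = 1) -> int:
    """Longest run of consecutive failed probes, expressed in minutes.

    `results` holds one entry per probe, taken `interval_minutes` apart.
    """
    longest = current = 0
    for ok in results:
        current = 0 if ok else current + 1
        longest = max(longest, current)
    return longest * interval_minutes
```

One way to use it is to call can_reach once a minute against the host your agents talk to (dev.azure.com is an assumption here; an on-premises server would have a different name), record the results, and compare longest_outage_minutes with the 10-minute window described in the quote.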
Among the issues I've seen are anti-virus programs identifying it as a threat, local proxies acting up in various ways, the physical machine running out of memory or disk space (quite common), the machine rebooting unexpectedly, someone ctrl+c'ing the whole listener process, the work payload being run at a way higher priority than the listener (thus "starving" the listener out), unit tests shutting down network adapters (quite common), having too many agents at normal priority on the same machine so they starve each other out, etc.
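Since running out of disk space is called out as a common cause, a quick check on the agent machine might look like the sketch below. The function names and the 10% threshold are assumptions for illustration, not values taken from Azure DevOps.

```python
import shutil


def free_disk_fraction(path: str) -> float:
    """Fraction of the volume containing `path` that is still free (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some virtual file systems report a zero size; treat them as full.
        return 0.0
    return usage.free / usage.total


def is_disk_low(path: str, min_free_fraction: float = 0.10) -> bool:
    """True when less than `min_free_fraction` of the volume is free.

    The 10% default is an arbitrary illustrative threshold.
    """
    return free_disk_fraction(path) < min_free_fraction
```

Pointing is_disk_low at the agent's work directory before or during a build gives an early warning; memory and CPU pressure would need a separate check.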
If you think you're seeing an issue that cannot be explained by any of the above (and nothing jumps out at you from the _diag logs folder), please file an issue at https://azure.microsoft.com/en-us/support/devops/
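As for where to look: the _diag folder mentioned above sits in the agent's installation directory. A minimal sketch for skimming it is shown here, assuming the logs are plain-text files ending in .log; the keyword list is a guess at what problem lines might contain, so adjust it to what your logs actually show.

```python
from pathlib import Path

# Assumed markers of trouble; these are illustrative, not an official list.
KEYWORDS = ("ERR", "Exception", "timed out")


def scan_diag(diag_dir: str, keywords=KEYWORDS):
    """Return (file name, line number, line text) for each line containing a keyword."""
    hits = []
    for log_file in sorted(Path(diag_dir).glob("*.log")):
        with log_file.open(encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if any(keyword in line for keyword in keywords):
                    hits.append((log_file.name, line_number, line.rstrip()))
    return hits
```

Calling scan_diag with the path to your agent's _diag folder and reading the hits around the time the "We stopped hearing from agent" message appeared is usually a reasonable first step.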
If everything seems to be perfectly alright with your agent and none of the steps mentioned in the Pipeline troubleshooting guide help, please report it on Developer Community, where the Azure DevOps Team and DevOps community are actively answering questions.