
Jenkins: jobs in queue are stuck and not triggered to be restarted

For a while now, our Jenkins has been experiencing critical problems. Jobs hang, and the job scheduler does not trigger builds. After restarting the Jenkins service everything returns to normal, but after some time the problems come back (this can be a week, a day, or even less). Any idea where we can start looking? I'd appreciate any help on this issue.

Muatik made a good point in his comment: the recommended approach is to run jobs on agent (slave) nodes. If you already do that, you can look at:

  1. Jenkins master machine CPU, RAM and hard disk usage. Access the machine directly and/or use a plugin like Java Melody. I have seen missing graphics in build test results and stuck builds caused by running out of hard disk space. You could also have hit the RAM or CPU limit for the slaves/jobs you are executing. You may need more heap space.
  2. Look at the Jenkins log files, starting with severe exceptions. If the files are too big or you see logrotate exceptions, you can change the logging levels so that fewer exceptions are logged. For more details see my article on this topic. Try to fix the exceptions you see logged.
  3. Go through recent changes that could cause such behavior, for example new plugins or changes to config files (jenkins.xml).
  4. Look at TCP connections. Run netstat -a and check whether there are suspicious connections (CLOSE_WAIT status).
  5. Delete old builds that you do not need.
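As a rough aid for points 1 and 4, here is a minimal Python sketch that counts TCP connection states in saved netstat -a output and reports the free disk fraction for a path. The function names and the list of state names are my own assumptions (Linux-style netstat output); adjust them for your platform.

```python
import shutil
from collections import Counter

# Assumption: state names as printed by Linux netstat; Windows uses
# slightly different spellings (for example LISTENING).
TCP_STATES = {
    "ESTABLISHED", "SYN_SENT", "SYN_RECV", "FIN_WAIT1", "FIN_WAIT2",
    "TIME_WAIT", "CLOSE", "CLOSE_WAIT", "LAST_ACK", "LISTEN", "CLOSING",
}


def count_tcp_states(netstat_output):
    """Count TCP connection states found in the text output of netstat -a."""
    counts = Counter()
    for line in netstat_output.splitlines():
        tokens = line.split()
        # Only consider rows whose protocol column is tcp/tcp6/TCP.
        if not tokens or not tokens[0].lower().startswith("tcp"):
            continue
        for token in tokens[1:]:
            if token in TCP_STATES:
                counts[token] += 1
                break
    return counts


def free_disk_fraction(path):
    """Return the fraction of free space on the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total
```

You could feed it the text of netstat -a captured on the master and look at the CLOSE_WAIT count over time; a number that keeps growing suggests connections that are never closed. Calling free_disk_fraction on the Jenkins home directory gives a quick check for the disk space problem mentioned in point 1.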

We had been facing this issue for the last 4 months and tried everything: changing CPU and memory resources, and increasing the desired number of nodes in the ASG. Nothing seemed to work.

Solution:
1. Go to Manage Jenkins -> System Configuration -> Maven project configurations.
2. In the "Usage" field, select "Only build jobs with label expressions matching this node".

Doing this resolved it, and Jenkins is running like a rocket now :)
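To see whether a change like this actually helps, it can be useful to look at what Jenkins itself reports about queued items. Below is a minimal Python sketch, assuming the queue is readable as JSON at /queue/api/json and that each item carries stuck, why and task.name fields. The helper names are hypothetical, and authentication is left out, so a secured instance will need credentials added.

```python
import json
from urllib.request import urlopen


def fetch_queue(base_url):
    """Download the build queue as JSON (assumes anonymous read access)."""
    with urlopen(base_url.rstrip("/") + "/queue/api/json") as response:
        return json.load(response)


def stuck_items(queue):
    """Return (job name, reason) pairs for queue items flagged as stuck."""
    result = []
    for item in queue.get("items", []):
        if item.get("stuck"):
            name = item.get("task", {}).get("name", "<unknown>")
            result.append((name, item.get("why") or ""))
    return result
```

Calling stuck_items(fetch_queue(your_jenkins_url)) lists the stuck jobs together with the reason Jenkins gives for each. A reason that mentions missing nodes or labels points toward the node usage and label settings described above.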


 