Lucee/Java at 100% CPU usage on Azure

Question

On today's episode of WTF is going on...

This is an issue we've been experiencing for the last few days. For some reason Java is pegged at 100% CPU usage. If we disable the Lucee service, CPU usage drops to a normal level. As soon as we enable it again, CPU usage immediately spikes back to 100%.

The full command line of the process is:

/opt/lucee/jdk/jre/bin/java -Djava.util.logging.config.file=/opt/lucee/tomcat/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager

On our other servers running Lucee, this process normally sits at around 1-3% CPU usage.

We've restarted Ubuntu multiple times, to no avail.
We've disabled all ports in case some weird traffic was causing the issue, again to no avail.
We've verified there are no Lucee scripts or tasks running.
We do have a separate SQL service instance running with Microsoft, which does not show any unusual DB usage. We also have around 5 other servers that access the same DB and are not experiencing this issue.
I've downgraded Lucee from 5.3.4.80 to 5.2.9.3, again with no luck.

Other info for the server:

OS      Linux (4.4.0-174-generic) 64bit
Servlet Container   Apache Tomcat/8.5.6
Java    1.8.0_112 (Oracle Corporation) 64bit

This first happened to us about two weeks ago. We came in on Monday 2/24, and by about 9am CT one of our servers had started showing this issue. We set up a separate Azure VM, copied all our files over, and got everything up and running just fine. Now, two weeks later, two other servers are starting to have the same issue.

I'd appreciate any help you can provide.

Answer 1

I had the same issue on an AWS server running Ubuntu and Java 8_181. It started suddenly in the middle of the night. Top was showing 2 CPUs fully loaded, just like yours. Restarting Lucee/Tomcat and rebooting had no effect.

FusionReactor pointed to an issue with Scheduled Tasks: its Thread Visualizer showed two spinning task threads, with stack traces similar to the one at the end of this answer.

I killed these threads, and the spinning stopped. I could then see two of my scheduled tasks marked in pink in the Lucee Administrator as permanently stopped. Re-enabling these tasks and restarting Lucee brought the problem back, so I killed the threads again, and again the tasks went pink in the Lucee Administrator. They did not run on their normal schedule either. The other scheduled tasks kept running OK, and after a few hours things were still normal.

I then removed and recreated the two scheduled tasks that seemed to be the issue, and restarted Lucee. The two tasks ran on schedule. I have therefore concluded that the timing information for the two tasks had somehow become corrupted, causing the spin when Lucee was trying to calculate the next run time. The Lucee source code around the point where it seemed to be spinning has a 'while(1)' loop that appears to increment a date variable; I suspect that's where things were stuck.
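To make that suspicion concrete, here is a minimal sketch of how a "step forward until the date is in the future" loop can burn CPU. This is not the Lucee source; the class, method names, and the maxSteps guard are hypothetical and exist only for illustration. It assumes the next run time is found by repeatedly calling GregorianCalendar.add, which is the call that appears in the stack trace in this answer.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical illustration only; not taken from Lucee.
class NextRunSketch {

    // Steps from the task's start time one interval at a time until the
    // calendar is past "now". Returns the number of add() calls made.
    // maxSteps is a safety guard added for this sketch so it always ends.
    static long stepsByLooping(long startMillis, long nowMillis,
                               int intervalSeconds, long maxSteps) {
        Calendar c = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        c.setTimeInMillis(startMillis);
        long steps = 0;
        while (c.getTimeInMillis() <= nowMillis && steps < maxSteps) {
            c.add(Calendar.SECOND, intervalSeconds); // same call as in the trace
            steps++;
        }
        return steps;
    }

    // Closed-form alternative: constant cost no matter how far back the
    // start time lies. Rejects a non-positive interval instead of spinning.
    static long nextRunClosedForm(long startMillis, long nowMillis,
                                  long intervalMillis) {
        if (intervalMillis <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        if (startMillis > nowMillis) {
            return startMillis;
        }
        long elapsed = nowMillis - startMillis;
        return startMillis + (elapsed / intervalMillis + 1) * intervalMillis;
    }

    public static void main(String[] args) {
        long oneDay = 86_400_000L;
        // A one-day gap with a one-second interval already needs tens of
        // thousands of calendar operations.
        System.out.println("steps: " + stepsByLooping(0L, oneDay, 1, 1_000_000L));
        // An interval of zero never advances; only the guard stops the loop.
        System.out.println("steps: " + stepsByLooping(0L, oneDay, 0, 1_000L));
        System.out.println("next:  " + nextRunClosedForm(0L, oneDay, 1_000L));
    }
}
```

If the stored start date or interval of a task is corrupted (for example, a start date far in the past or an interval that does not advance the calendar), a loop of this shape either runs for a very long time or never ends, which would match a thread stuck inside GregorianCalendar.add.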

In summary: delete the affected scheduled tasks and recreate them, and you may have a workaround.

java.util.SimpleTimeZone.getOffsets(SimpleTimeZone.java:551)
- locked <0x5d7f0b89> (a java.util.SimpleTimeZone)
java.util.SimpleTimeZone.getOffset(SimpleTimeZone.java:540)
sun.util.calendar.ZoneInfo.getOffsets(ZoneInfo.java:293)
sun.util.calendar.ZoneInfo.getOffsets(ZoneInfo.java:236)
java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2340)
java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2312)
java.util.Calendar.setTimeInMillis(Calendar.java:1804)
java.util.GregorianCalendar.add(GregorianCalendar.java:1076)
lucee.runtime.schedule.ScheduledTaskThread.calculateNextExecution(ScheduledTaskThread.java:219)
lucee.runtime.schedule.ScheduledTaskThread._run(ScheduledTaskThread.java:121)
lucee.runtime.schedule.ScheduledTaskThread.run(ScheduledTaskThread.java:87)
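
If you don't have FusionReactor, a rough equivalent of that per-thread view can be sketched with the JDK's standard java.lang.management API. The class and method names below (HotThreads, topThreads) are made up for this example, and the code only sees threads of the JVM it runs in, so to inspect Lucee it would have to be executed inside the Tomcat/Lucee process; how you load it there is left open.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper; lists the threads of the current JVM that have
// consumed the most CPU time, with the top of each stack.
class HotThreads {

    static List<String> topThreads(int n, int maxDepth) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = mx.isThreadCpuTimeSupported();

        // Collect {threadId, cpuNanos}; cpuNanos is -1 when unavailable.
        List<long[]> usage = new ArrayList<>();
        for (long id : mx.getAllThreadIds()) {
            long cpu = cpuSupported ? mx.getThreadCpuTime(id) : -1L;
            usage.add(new long[] {id, cpu});
        }
        // Highest CPU time first.
        usage.sort(Comparator.comparingLong((long[] u) -> u[1]).reversed());

        List<String> report = new ArrayList<>();
        for (long[] u : usage) {
            if (report.size() >= n) {
                break;
            }
            ThreadInfo info = mx.getThreadInfo(u[0], maxDepth);
            if (info == null) {
                continue; // thread ended between the two calls
            }
            StringBuilder sb = new StringBuilder();
            sb.append(info.getThreadName())
              .append(" cpu=").append(u[1] / 1_000_000L).append("ms\n");
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            report.add(sb.toString());
        }
        return report;
    }

    public static void main(String[] args) {
        for (String entry : topThreads(5, 12)) {
            System.out.println(entry);
        }
    }
}
```

A thread that keeps showing frames like ScheduledTaskThread.calculateNextExecution at the top of this list across several samples would be a candidate for the spinning task described above.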
