
Java Object.notifyAll() blocks for a long time while consuming high CPU

We have a large enterprise telecom application running on OpenJDK 8 (25.171-b10) on RHEL 6.10.

We observe a very strange issue that occurs randomly on production servers.

The process starts consuming high CPU, and eventually we run into memory congestion because the Java heap is nearly full.

After analyzing the Java process using thread dumps, we noticed that one thread is stuck inside the Java “notifyAll” method. The thread remains inside this method for a very long time, and the process consumes high CPU for as long as it stays there.

Below is the stack trace of the thread that remains stuck:

"pool-5-thread-1" #432 prio=5 os_prio=0 tid=0x00007f7c85a57000 nid=0x37fe runnable [0x00007f7c3cba0000]
   java.lang.Thread.State: RUNNABLE
       at java.lang.Object.notifyAll(Native Method)
       at gov.nist.javax.sip.parser.Pipeline.close(Pipeline.java:165)
       - locked <0x00000005c16007b8> (a java.util.LinkedList)
       at gov.nist.javax.sip.stack.NioTcpMessageChannel.close(NioTcpMessageChannel.java:258)
       at gov.nist.javax.sip.stack.ConnectionOrientedMessageChannel.close(ConnectionOrientedMessageChannel.java:203)
       at gov.nist.javax.sip.stack.SocketTimeoutAuditor.runTask(SocketTimeoutAuditor.java:74)

The thread remains stuck inside “notifyAll()” for several minutes while consuming high CPU.
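For anyone who wants to cross-check which native thread is burning the CPU: the nid field in a HotSpot thread dump is the native thread ID in hexadecimal, while `top -H -p <pid>` lists thread IDs in decimal. A minimal sketch of the conversion follows; the class name NidToTid is hypothetical and used only for illustration.

```java
// Hypothetical helper: converts the hex "nid" from a HotSpot thread dump
// into the decimal Linux thread ID shown by `top -H -p <pid>`.
class NidToTid {
    static long toDecimal(String nid) {
        // Accept the value with or without the leading "0x".
        String hex = nid.startsWith("0x") ? nid.substring(2) : nid;
        return Long.parseLong(hex, 16);
    }

    public static void main(String[] args) {
        // nid taken from the stack trace above
        System.out.println(toDecimal("0x37fe")); // prints 14334
    }
}
```

If thread 14334 is the one at the top of the per-thread CPU list, the CPU time is being spent by the thread inside notifyAll() itself rather than by some other thread such as a GC worker.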

To provide more information on the Java wait/notify mechanism being used:

We have a class “Pipeline” with a private member “buffList” of type LinkedList.

An instance of Pipeline starts a thread which synchronizes on “buffList” and calls 'this.buffList.wait()'.

public int read() throws IOException {
    synchronized (this.buffList) {
        ...
        this.buffList.wait();
        ...
    }
}

When the Pipeline instance has some data or needs to be closed, another independent thread calls notifyAll(). The example snippet below shows close():

public void close() throws IOException {
    ...
    synchronized (this.buffList) {
        this.buffList.notifyAll();
    }
}
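To make the mechanism described above easier to reason about in isolation, here is a self-contained sketch of the same wait/notifyAll pattern. It is not the actual Pipeline code: MiniPipeline, the closed flag, write(), and the guarded while loop are assumptions added for illustration, and the elided parts of the real read() and close() remain unknown.

```java
import java.util.LinkedList;

// Hypothetical, simplified stand-in for the Pipeline class; not the real code.
class MiniPipeline {
    private final LinkedList<Integer> buffList = new LinkedList<>();
    private boolean closed = false; // guarded by the buffList monitor

    // Blocks until a value is available; returns -1 once closed and drained.
    int read() throws InterruptedException {
        synchronized (buffList) {
            // The loop guards against spurious wakeups and against
            // notifyAll() waking several readers for a single element.
            while (buffList.isEmpty() && !closed) {
                buffList.wait();
            }
            return buffList.isEmpty() ? -1 : buffList.removeFirst();
        }
    }

    void write(int value) {
        synchronized (buffList) {
            buffList.add(value);
            buffList.notifyAll();
        }
    }

    void close() {
        synchronized (buffList) {
            closed = true;
            buffList.notifyAll(); // wakes every thread waiting on buffList
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MiniPipeline p = new MiniPipeline();
        Thread reader = new Thread(() -> {
            try {
                int v;
                while ((v = p.read()) != -1) {
                    System.out.println("read " + v);
                }
                System.out.println("closed");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        reader.start();
        p.write(42);
        p.close();
        reader.join();
    }
}
```

In this sketch notifyAll() only has to wake the threads currently waiting on the one buffList monitor, so it returns almost immediately; that is the normal behavior we see outside the failure case.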

Normally this works fine, but for some reason (which we can currently neither understand nor reproduce), the call to 'notifyAll()' above not only blocks for several minutes but also causes high CPU usage.

We have tried multiple ways to reproduce the issue, but so far it appears only in production. Has anyone faced a similar issue? Any ideas on how we can troubleshoot or find the root cause of this behavior?
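One possible way to gather more evidence is to sample per-thread CPU time from inside the JVM with the standard ThreadMXBean API and print each thread's state and the lock it is waiting on. The sketch below is only an illustration under assumptions: the CpuSampler name and the one-second interval are hypothetical, and the code would have to run inside the affected process (or be adapted to a remote JMX connection) to be of any use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

// Hypothetical diagnostic helper; class name and interval are illustrative.
class CpuSampler {
    // Returns CPU nanoseconds consumed per thread ID during the interval.
    static Map<Long, Long> sample(long intervalMillis) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Long, Long> before = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            // -1 if the thread is not alive or CPU time measurement is disabled
            before.put(id, mx.getThreadCpuTime(id));
        }
        Thread.sleep(intervalMillis);
        Map<Long, Long> delta = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            long now = mx.getThreadCpuTime(id);
            Long then = before.get(id);
            // Keep only threads that were measurable at both sample points.
            if (now >= 0 && then != null && then >= 0) {
                delta.put(id, now - then);
            }
        }
        return delta;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (Map.Entry<Long, Long> e : sample(1000).entrySet()) {
            ThreadInfo info = mx.getThreadInfo(e.getKey());
            if (info != null) { // null if the thread has exited meanwhile
                System.out.printf("%-40s %8d ms  state=%s lock=%s%n",
                        info.getThreadName(), e.getValue() / 1_000_000,
                        info.getThreadState(), info.getLockName());
            }
        }
    }
}
```

Comparing the thread that is inside notifyAll() with the number of threads whose lock name points at the same LinkedList monitor might show whether an unusually large set of waiters is involved, though that is only a guess at this point.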
