
Tomcat 100% CPU Usage

I have tried everything to find the cause, but I am not sure what is happening. I tried profiling with JProfiler, but it does not return any results because the CPU is already at 100%.

I am running a Spring application (WAR file) on Tomcat 8 on Amazon Linux. It performs no database operations, but it does call a REST API, running in a separate environment, for various operations. Three servers running behind a load balancer in an Elastic Beanstalk environment all have the same issue, with an average of 518K requests per hour and an average latency of 72.2 milliseconds.

I feel like there is not much wrong with the code itself.

I just ran kill -3 pid against the Java process and got some output in the catalina.out file. I can see the following entry again and again:

"http-nio-8080-exec-13" #42 daemon prio=5 os_prio=0 tid=0x00007f0898005800 nid=0xfb2 waiting on condition [0x00007f0882dec000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000f1694f58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:85)
    at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:31)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:745)

"http-nio-8080-exec-18" #41 daemon prio=5 os_prio=0 tid=0x00007f088c024000 nid=0xfb1 waiting on condition [0x00007f0882eed000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000f1694f58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:85)
    at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:31)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:745)

and at the end is the following information:

"VM Thread" os_prio=0 tid=0x00007f08d8081000 nid=0xf8b runnable 

"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f08d801e800 nid=0xf89 runnable 

"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f08d8020800 nid=0xf8a runnable 

"VM Periodic Task Thread" os_prio=0 tid=0x00007f08d80f0000 nid=0xf92 waiting on condition 

JNI global references: 478

Heap
 PSYoungGen      total 85504K, used 28368K [0x00000000fab00000, 0x0000000100000000, 0x0000000100000000)
  eden space 83968K, 33% used [0x00000000fab00000,0x00000000fc61cd08,0x00000000ffd00000)
  from space 1536K, 39% used [0x00000000ffd00000,0x00000000ffd97340,0x00000000ffe80000)
  to   space 1536K, 0% used [0x00000000ffe80000,0x00000000ffe80000,0x0000000100000000)
 ParOldGen       total 175104K, used 107142K [0x00000000f0000000, 0x00000000fab00000, 0x00000000fab00000)
  object space 175104K, 61% used [0x00000000f0000000,0x00000000f68a1a48,0x00000000fab00000)
 Metaspace       used 56563K, capacity 60632K, committed 60800K, reserved 1103872K
  class space    used 5218K, capacity 5736K, committed 5760K, reserved 1048576K

Could someone please explain what is happening here?

OK, here are some of the threads I pulled from the log that are in the RUNNABLE state:

(these appear a couple of times in the log)

"ajp-nio-8009-Acceptor-0" #23 daemon prio=5 os_prio=0 tid=0x00007f08d8535800 nid=0xfa0 runnable [0x00007f0883ffe000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
    - locked <0x00000000f0881618> (a java.lang.Object)
    at org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:682)
    at java.lang.Thread.run(Thread.java:745)

"ajp-nio-8009-ClientPoller-1" #22 daemon prio=5 os_prio=0 tid=0x00007f08d8534000 nid=0xf9f runnable [0x00007f08a83b2000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    - locked <0x00000000f16a8100> (a sun.nio.ch.Util$2)
    - locked <0x00000000f16a80f0> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000f16a7fc8> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1034)
    at java.lang.Thread.run(Thread.java:745)

"ajp-nio-8009-ClientPoller-0" #21 daemon prio=5 os_prio=0 tid=0x00007f08d82f8800 nid=0xf9e runnable [0x00007f08a84b3000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    - locked <0x00000000f16a8900> (a sun.nio.ch.Util$2)
    - locked <0x00000000f16a88f0> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000f16a87c8> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1034)
    at java.lang.Thread.run(Thread.java:745)

"http-nio-8080-Acceptor-0" #20 daemon prio=5 os_prio=0 tid=0x00007f08d82f7000 nid=0xf9d runnable [0x00007f08a85b4000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
    - locked <0x00000000f0882f68> (a java.lang.Object)
    at org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:682)
    at java.lang.Thread.run(Thread.java:745)

"http-nio-8080-ClientPoller-1" #19 daemon prio=5 os_prio=0 tid=0x00007f08d82f5800 nid=0xf9c runnable [0x00007f08a86b5000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    - locked <0x00000000f1677900> (a sun.nio.ch.Util$2)
    - locked <0x00000000f16778f0> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000f16777a8> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1034)
    at java.lang.Thread.run(Thread.java:745)

"http-nio-8080-ClientPoller-0" #18 daemon prio=5 os_prio=0 tid=0x00007f08d82f4000 nid=0xf9b runnable [0x00007f08a87b6000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    - locked <0x00000000f1666290> (a sun.nio.ch.Util$2)
    - locked <0x00000000f1666280> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000f1666138> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1034)
    at java.lang.Thread.run(Thread.java:745)

There is another one. I am not sure whether it is causing a problem (it appears only once in the log):

"http-nio-8080-exec-131" #160 daemon prio=5 os_prio=0 tid=0x00007f088c100800 nid=0x153e runnable [0x00007f08727e5000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:170)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    - locked <0x00000000f688f4d0> (a java.io.BufferedInputStream)
    at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704)
    at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1536)
    - locked <0x00000000f6892258> (a sun.net.www.protocol.http.HttpURLConnection)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
    - locked <0x00000000f6892258> (a sun.net.www.protocol.http.HttpURLConnection)
    at sun.net.www.protocol.http.HttpURLConnection.getHeaderFields(HttpURLConnection.java:2966)
    at com.code.http.WebUtility.getUrlContents(WebUtility.java:163)

The code on this line is con.getHeaderFields(), where con is an HttpURLConnection:

try {
    con = (HttpURLConnection) url.openConnection();
    con.setDoInput(true);
    con.setDoOutput(true);
    con.connect();
    writeJson (con);

    this.header = con.getHeaderFields(); // this is the line reported as RUNNABLE in the dump
    this.code = con.getResponseCode();
    return readSuccessStream(con);
} catch (IOException e) {
    if (con != null) {
        return readFailureStream(con);
    }
    return "a nasty error occurred";
} finally {
    if (con != null) {
        con.disconnect();
    }
}

and a few other threads in the RUNNABLE state:

"Service Thread" #7 daemon prio=9 os_prio=0 tid=0x00007f08d80db000 nid=0xf91 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f08d80c8800 nid=0xf90 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f08d80bb000 nid=0xf8f waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f08d80b9000 nid=0xf8e waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

and the last one:

"main" #1 prio=5 os_prio=0 tid=0x00007f08d8009800 nid=0xf88 runnable [0x00007f08de871000]
   java.lang.Thread.State: RUNNABLE
    at java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
    at java.net.ServerSocket.implAccept(ServerSocket.java:545)
    at java.net.ServerSocket.accept(ServerSocket.java:513)
    at org.apache.catalina.core.StandardServer.await(StandardServer.java:446)
    at org.apache.catalina.startup.Catalina.await(Catalina.java:717)
    at org.apache.catalina.startup.Catalina.start(Catalina.java:663)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:351)
    at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:485)

Thread states

The states of a thread in a Java virtual machine are:

NEW
A thread that has not yet started is in this state.

RUNNABLE
A thread executing in the Java virtual machine is in this state.

BLOCKED
A thread that is blocked waiting for a monitor lock is in this state.

WAITING
A thread that is waiting indefinitely for another thread to perform a particular action is in this state.

TIMED_WAITING
A thread that is waiting for another thread to perform an action for up to a specified waiting time is in this state.

TERMINATED
A thread that has exited is in this state.

In Java, threads are represented as Thread objects (java.lang.Thread). When you create a new thread, literally with new Thread(), you create a thread, but it isn't doing anything yet. Once you call Thread.start(), it moves from the NEW state into the RUNNABLE state. Runnable means it has pending work for the CPU to perform. While running, a thread can move between RUNNABLE, BLOCKED, WAITING, and TIMED_WAITING. When a thread exits, it is left in the TERMINATED state.
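As a small illustration of these transitions, here is a minimal sketch that samples the state of one thread before it starts, while it sleeps, and after it exits. The class and method names (ThreadStateDemo, observeStates) are made up for this example and have nothing to do with Tomcat.

```java
import java.util.ArrayList;
import java.util.List;

public class ThreadStateDemo {

    /** Records the state of one thread before start, while sleeping, and after exit. */
    public static List<Thread.State> observeStates() {
        List<Thread.State> seen = new ArrayList<>();
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000); // a timed wait: the thread uses no CPU here
            } catch (InterruptedException e) {
                // interrupted on purpose below so the thread can exit
            }
        });

        seen.add(worker.getState()); // not started yet

        worker.start();
        // Spin until the worker has actually reached the sleep call.
        while (worker.getState() != Thread.State.TIMED_WAITING) {
            Thread.yield();
        }
        seen.add(worker.getState()); // sleeping

        worker.interrupt();
        try {
            worker.join(); // wait for the worker to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        seen.add(worker.getState()); // exited

        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observeStates());
    }
}
```

Running main should print [NEW, TIMED_WAITING, TERMINATED]. The sleeping thread is in the same state as the idle http-nio-8080-exec workers in the dump.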

When trying to diagnose high CPU usage by a Java process, we can rule out threads which are NEW or TERMINATED, since they are not alive. Threads that are BLOCKED, WAITING, or TIMED_WAITING are not doing anything. They are, literally, waiting for some event in the future.

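What does that mean? Only threads in the RUNNABLE state can be putting load on the CPU. Keep in mind that RUNNABLE is necessary but not sufficient: a thread blocked inside a native call such as accept0, epollWait, or socketRead0 is also reported as RUNNABLE, even though it is usually idle while it waits for the operating system.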
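A thread's state alone does not tell you how much CPU it has actually used. One way to check is the standard java.lang.management.ThreadMXBean API, which reports accumulated CPU time per Java thread. The following is only a sketch: the class name ThreadCpuReport is hypothetical, and the code would have to run inside the Tomcat JVM (for example from a diagnostic servlet) or be adapted to a remote JMX connection to say anything about your server. It also covers only Java threads, not internal JVM threads such as the GC task threads.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

public class ThreadCpuReport {

    /**
     * Returns accumulated CPU time in nanoseconds for each live Java thread,
     * keyed by "#id name [STATE]". Returns an empty map if the JVM does not
     * support or has not enabled per-thread CPU time measurement.
     */
    public static Map<String, Long> cpuTimeByThread() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Long> result = new TreeMap<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return result;
        }
        for (long id : mx.getAllThreadIds()) {
            ThreadInfo info = mx.getThreadInfo(id);   // null if the thread has died
            long cpuNanos = mx.getThreadCpuTime(id);  // -1 if the thread has died
            if (info != null && cpuNanos >= 0) {
                String key = "#" + id + " " + info.getThreadName()
                        + " [" + info.getThreadState() + "]";
                result.put(key, cpuNanos);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Print the threads with the most accumulated CPU time first.
        cpuTimeByThread().entrySet().stream()
                .sorted((a, b) -> Long.compare(b.getValue(), a.getValue()))
                .forEach(e -> System.out.printf("%10d ms  %s%n",
                        e.getValue() / 1_000_000, e.getKey()));
    }
}
```

Taking two samples a few seconds apart and comparing them shows which threads are burning CPU right now, rather than since startup.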

What are all those threads?

Tomcat uses a pool of threads to be able to handle multiple requests at the same time.

This first thread is the acceptor thread. Its purpose is to listen on the network socket for incoming requests. But in order to handle multiple requests simultaneously, it needs to delegate the work. Otherwise, each request would lock up the server until it finishes, meaning only one user could connect to the web server at a time. So this thread simply accepts the connections and then hands them off to another thread so it can accept more connections.

"http-nio-8080-Acceptor-0" #20 daemon prio=5 os_prio=0 tid=0x00007f08d82f7000 nid=0xf9d runnable [0x00007f08a85b4000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
    - locked <0x00000000f0882f68> (a java.lang.Object)
    at org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:682)
    at java.lang.Thread.run(Thread.java:745)

These threads are the worker threads to which Tomcat assigns the requests:

"http-nio-8080-exec-13" #42 daemon prio=5 os_prio=0 tid=0x00007f0898005800 nid=0xfb2 waiting on condition [0x00007f0882dec000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000f1694f58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:85)
    at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:31)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:745)

That means the thread is sleeping. It is parked on the executor's task queue (TaskQueue.poll in the stack trace), waiting to be handed work. It is contributing nothing to your 100% CPU usage.
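The same parked state can be reproduced with a plain java.util.concurrent pool. This is a sketch only: it uses the JDK's ThreadPoolExecutor as a stand-in and is not Tomcat's own executor configuration, and the class name IdleWorkerDemo is made up for the example. With a keep-alive timeout, an idle worker polls the queue with a timeout, which is the LinkedBlockingQueue.poll / parkNanos sequence visible in the dump.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class IdleWorkerDemo {

    /** Runs one trivial task on a pool, then reports the state of the idle worker. */
    public static Thread.State idleWorkerState() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        // With this flag, idle workers wait on the queue with a timeout
        // instead of waiting indefinitely.
        pool.allowCoreThreadTimeOut(true);

        AtomicReference<Thread> worker = new AtomicReference<>();
        try {
            // Capture the worker thread, and wait for the task to complete.
            pool.submit(() -> worker.set(Thread.currentThread())).get();

            // Give the worker a moment to return to the queue and park.
            Thread.State state = worker.get().getState();
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (state != Thread.State.TIMED_WAITING && System.nanoTime() < deadline) {
                Thread.sleep(10);
                state = worker.get().getState();
            }
            return state;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("Idle worker state: " + idleWorkerState());
    }
}
```

An idle worker in this state costs memory for its stack but no CPU, so a long list of such threads in a dump is normal for a pool that is not saturated.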

Conclusion

The runnable threads I saw in your thread dumps suggest that reading from a network socket is where the CPU load is spent, but I cannot say definitively why. There are two possible reasons: the network or the remote system has very high latency, or reading is slow because of memory pressure.

Most of the time when I have seen this, the CPU work was coming from the garbage collector, which runs in another thread. When there is pressure to allocate memory and little is available, the garbage collector has to do a lot of expensive work to find available memory while the other threads wait. This thread might not be visible in the thread dumps you showed since it's not a thread created in Java code, but is an internal part of the Java virtual machine. I can't say for sure if that is the issue in your case. It could be a network problem or a problem with the remote system you are reading from. I suggest looking carefully at every thread to try to find a pattern.
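To get a rough idea of whether garbage collection is a factor, the JVM exposes collector statistics through the standard java.lang.management.GarbageCollectorMXBean API. The sketch below is an assumption about how you might check, not something taken from your setup: the class name GcTimeReport is hypothetical, and the code has to run inside the Tomcat JVM (or be adapted to a remote JMX connection) to be meaningful. The reported collection time is approximate elapsed time, not CPU time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcTimeReport {

    /** Sums the accumulated collection time (ms) of all collectors in this JVM. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Approximate fraction of JVM uptime that has been spent in garbage collection. */
    public static double gcFractionOfUptime() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : (double) totalGcMillis() / uptimeMillis;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%-20s collections=%d time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.printf("GC share of uptime: %.1f%%%n", gcFractionOfUptime() * 100);
    }
}
```

If the collection count climbs quickly between two samples while the old generation stays nearly full, that points toward memory pressure. If the numbers barely move, the garbage collector is probably not the cause.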
