
High thread context switching for Java web application

We have been load testing our Java web application and observe high CPU usage with only 50 users (which doesn't seem reasonable). The CPU shoots up above 80%. While profiling it with Java Flight Recorder (JFR), we see that the context switch rate is 8400 per second (as seen in the Hot Threads tab in Java Mission Control). Analyzing the hot threads in JFR, the CPU usage seems to be distributed across the application threads, with each thread using less than 3% CPU.

Increasing the load to 100, 150 or 200 users, we see the CPU shooting up above 90%, the throughput (transactions per second) remaining constant (the same as with 50 users), and the response time crossing the acceptable threshold (3 sec). Decreasing the load to 20 users still shows CPU usage averaging above 55%. We don't believe the application threads themselves are using up the CPU, since our application is not CPU bound. The Hot Packages tab under the Code tab group confirms this by showing that the application spends most of its time executing database queries.

We use GlassFish 3.1.2.2 as our application server, with the max thread pool size configured as 100. The operating system is Oracle Linux Server release 6.4 with Linux kernel version 2.6.39-400.214.4.el6uek.x86_64. I tried running the Linux commands "watch -n0.5 pidstat -w -I -p " and "watch -n.5 grep ctxt /proc//status" to see the voluntary and involuntary thread context switching at the OS level, but they don't give any results.
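For reference, the counters that the grep command above looks for can also be read per thread from /proc/PID/task/TID/status, where PID and TID are placeholders for the JVM process id and a thread id. As far as I know, the process-level status file only reports the counters of the main thread, so for a JVM the per-thread files are the interesting ones. Below is a minimal Java sketch of that idea; the class name CtxtSwitchReader and the method parseCtxt are made up for illustration, and it assumes a Linux kernel that exposes the voluntary_ctxt_switches and nonvoluntary_ctxt_switches fields.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of any library.
class CtxtSwitchReader {

    /**
     * Returns {voluntary, nonvoluntary} parsed from the lines of a /proc status file.
     * A value of -1 means the field was not present.
     */
    static long[] parseCtxt(List<String> statusLines) {
        long voluntary = -1;
        long nonvoluntary = -1;
        for (String line : statusLines) {
            if (line.startsWith("voluntary_ctxt_switches:")) {
                voluntary = Long.parseLong(line.substring(line.indexOf(':') + 1).trim());
            } else if (line.startsWith("nonvoluntary_ctxt_switches:")) {
                nonvoluntary = Long.parseLong(line.substring(line.indexOf(':') + 1).trim());
            }
        }
        return new long[] {voluntary, nonvoluntary};
    }

    public static void main(String[] args) throws IOException {
        String pid = args[0]; // process id of the JVM, passed on the command line
        List<Path> tasks;
        try (Stream<Path> stream = Files.list(Paths.get("/proc", pid, "task"))) {
            tasks = stream.collect(Collectors.toList());
        }
        for (Path task : tasks) {
            try {
                long[] c = parseCtxt(Files.readAllLines(task.resolve("status")));
                System.out.printf("tid=%s voluntary=%d nonvoluntary=%d%n",
                        task.getFileName(), c[0], c[1]);
            } catch (IOException e) {
                // The thread may have exited between listing and reading; skip it.
            }
        }
    }
}
```

The counters are cumulative, so running this twice a few seconds apart and subtracting the values gives a per-thread rate that can be compared with the figure JFR reports.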

Since we suspect that high context switching could be causing the CPU to shoot up, do you have guidelines on how to confirm that thread context switching is the cause of the high CPU usage, and are there ways to tune the JVM or the application if that is the cause?

Thanks!

You can use performance counters to count the number of context switches in an operation. To do so, use the perf tool.

The command should be perf stat -e cs <command>. This is an example:

[breno@debra ~]$ sudo perf stat  -e cs ls > /dev/null

Performance counter stats for 'ls':
   0    cs   (context switch)                                             

   0.001932855 seconds time elapsed

[breno@debra ~]$ sudo perf stat  -e cs ls -R > /dev/null

Performance counter stats for 'ls -R':

   3,130   cs (context switch)                                                        

   3.537120431 seconds time elapsed

I know it's an old one, but for the sake of whoever is dealing with the same issue:
The command pidstat -wt 3 will give you thread-level granularity for context switches. Then you can take a thread dump of your Java process and search for the thread number that shows high context switching (you might need to translate the thread number to hex, depending on your thread dump output). We're still not sure what the core problem is, though, because the thread with the highest context switching is pointing to:

"NioBlockingSelector.BlockPoller-1" #37 daemon prio=5 os_prio=0 tid=0x00007f2b60b1f000 nid=0x1f48 runnable [0x00007f2b40af6000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked <0x0000000700ae6c78> (a sun.nio.ch.Util$3)
        - locked <0x0000000700ae6c68> (a java.util.Collections$UnmodifiableSet)
        - locked <0x0000000700ae6b30> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:298)

And it makes sense, because it's a selector thread, but I'm not sure how to continue from here :)
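In case the hex translation step is unclear: pidstat prints thread ids in decimal, while the dump above prints the native id as nid=0x1f48, which is 8008 in decimal. A small sketch of the conversion in both directions follows; the class name NidConverter and its methods are just illustrative names, and it assumes the dump uses the 0x-prefixed hex form shown above.

```java
// Hypothetical helper for matching pidstat thread ids against thread dump nid values.
class NidConverter {

    /** Formats a decimal thread id (as printed by pidstat) in the 0x-prefixed hex form used for nid. */
    static String toNid(int tid) {
        return "0x" + Integer.toHexString(tid);
    }

    /** Parses an nid token such as "0x1f48" back into the decimal thread id. */
    static int fromNid(String nid) {
        return Integer.parseInt(nid.substring(2), 16);
    }

    public static void main(String[] args) {
        System.out.println(toNid(8008));       // prints 0x1f48
        System.out.println(fromNid("0x1f48")); // prints 8008
    }
}
```

With that, the TID column from pidstat can be searched for directly in the thread dump text.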
