JVM GC problems

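For the past few weeks I have been testing different JVM settings for a Glassfish server. The main settings for the heap (and others) are: -Xms512m, -Xmx512m, -XX:NewRatio=2. I have tried different GC settings, but a few days after starting the server I still run into long pauses. This is what I found:
1. -XX:+UseParallelGC -XX:+UseParallelOldGC - a minor GC happens every minute and a major GC every 18 hours. I have no problems with the minor GCs, but after 5 days the major GCs became a problem. At first the major GC pauses lasted about 100-200 ms, but the last pause lasted 70 seconds.
2. -XX:+UseConcMarkSweepGC -XX:+UseParNewGC - almost the same as above. Minor GCs are fine, but the major GC (not full) pauses became very long. I noticed that the problem is class unloading in the GC (CMS Final Remark) phase, which is a stop-the-world phase.
3. -XX:+UseConcMarkSweepGC -XX:+UseParNewGC and -XX:MaxGCPauseMillis=5000. I tested this for only one day, because the second major GC (not full) already lasted about 20 seconds, so I assumed something else was wrong.
4. -XX:+UseG1GC, -XX:MaxGCPauseMillis=5000, -XX:+UseStringDeduplication, without the -XX:NewRatio=2 option - a major GC (not full) happens every 12 hours, and I have already noticed some problems: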

2015-05-31T18:25:25.145+0200: 83383.897: [GC concurrent-mark-start]
2015-05-31T18:25:35.563+0200: 83394.312: [GC concurrent-mark-end, 10.4145795 secs]
2015-05-31T18:25:35.563+0200: 83394.312: [GC remark 83394.312: [Finalize Marking, 0.0002939 secs] 83394.312: [GC ref-proc, 1.2128584 secs] 83395.525: [Unloading, 14.5180500 secs], 15.7320854 secs]
 [Times: user=0.14 sys=0.22, real=15.73 secs] 

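The GC remark phase took 15 seconds, which is unacceptable for me. You can see that unloading accounts for most of that time. This also happened earlier with the other GCs, so I think there must be a problem with class unloading.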
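To put a number on "most of that time", here is a minimal Python sketch that splits the remark line above into its sub-phases. The regular expressions are written against this one log line only and are an assumption about the format, not a general G1 log parser; `remark_breakdown` is a made-up helper name.

```python
import re

# The remark line copied from the log excerpt above.
REMARK_LINE = (
    "2015-05-31T18:25:35.563+0200: 83394.312: [GC remark 83394.312: "
    "[Finalize Marking, 0.0002939 secs] 83394.312: [GC ref-proc, 1.2128584 secs] "
    "83395.525: [Unloading, 14.5180500 secs], 15.7320854 secs]"
)

# Sub-phases look like "[<name>, <seconds> secs]"; names hold no brackets or commas.
PHASE = re.compile(r"\[([^\[\],]+), (\d+\.\d+) secs\]")
# The total for the whole remark pause follows the last sub-phase: "], <seconds> secs]".
TOTAL = re.compile(r"\], (\d+\.\d+) secs\]\s*$")


def remark_breakdown(line):
    """Return (total_seconds, {phase_name: seconds}) for one remark line."""
    phases = {name: float(secs) for name, secs in PHASE.findall(line)}
    total = float(TOTAL.search(line).group(1))
    return total, phases


total, phases = remark_breakdown(REMARK_LINE)
for name, secs in phases.items():
    print(f"{name:<18} {secs:>10.4f} s  {100 * secs / total:5.1f}% of remark")
```

For this line the Unloading sub-phase comes out at roughly 92% of the remark pause, which matches the impression that class unloading dominates.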

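Summary: every GC runs fine for a while, but after a few days the problems start and the pauses become very long. I don't know why it works well for the first few days and then suddenly gets so bad. I noticed that the longer pauses are caused by class unloading, so I would like to know whether there are settings that give better results. Also, which GC would you recommend for me? I have an internal web application running on a Glassfish server on a PC with 8 GB of RAM, an i7 processor and Windows 8. At most 10 clients are connected at the same time, but it needs a long uptime and must not have long pauses (5 seconds at most). Please tell me what else I can do to shorten the pauses.

One more question: in my case, what are the disadvantages of using G1GC instead of CMS or ParallelGC? Is the heap too small for G1GC?

Edit: G1GC log before and after the long GC remark pause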

2015-05-31T18:25:25.004+0200: 83383.755: [GC pause (G1 Evacuation Pause) (young) (initial-mark), 0.1280453 secs]
   [Parallel Time: 116.2 ms, GC Workers: 4]
      [GC Worker Start (ms): Min: 83383757.6, Avg: 83383757.7, Max: 83383757.7, Diff: 0.0]
      [Ext Root Scanning (ms): Min: 97.8, Avg: 98.3, Max: 98.5, Diff: 0.7, Sum: 393.1]
      [Update RS (ms): Min: 0.2, Avg: 4.0, Max: 14.8, Diff: 14.6, Sum: 16.1]
         [Processed Buffers: Min: 1, Avg: 6.0, Max: 16, Diff: 15, Sum: 24]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.1]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 0.2, Avg: 2.5, Max: 3.7, Diff: 3.5, Sum: 10.2]
      [Termination (ms): Min: 0.0, Avg: 8.5, Max: 11.4, Diff: 11.4, Sum: 34.2]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [GC Worker Total (ms): Min: 113.4, Avg: 113.4, Max: 113.5, Diff: 0.0, Sum: 453.8]
      [GC Worker End (ms): Min: 83383871.1, Avg: 83383871.1, Max: 83383871.1, Diff: 0.0]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [String Dedup Fixup: 2.2 ms, GC Workers: 4]
      [Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Table Fixup (ms): Min: 2.0, Avg: 2.1, Max: 2.1, Diff: 0.1, Sum: 8.3]
   [Clear CT: 0.1 ms]
   [Other: 9.5 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 8.8 ms]
      [Ref Enq: 0.1 ms]
      [Redirty Cards: 0.3 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.1 ms]
   [Eden: 215.0M(215.0M)->0.0B(215.0M) Survivors: 7168.0K->7168.0K Heap: 451.5M(512.0M)->236.6M(512.0M)]
 [Times: user=0.08 sys=0.02, real=0.13 secs] 
2015-05-31T18:25:25.129+0200: 83383.883: [GC concurrent-root-region-scan-start]
2015-05-31T18:25:25.129+0200: 83383.883: [GC concurrent-string-deduplication, 160.0B->0.0B(160.0B), avg 48.3%, 0.0000070 secs]
   [Last Exec: 0.0000070 secs, Idle: 23.1834927 secs, Blocked: 0/0.0000000 secs]
      [Inspected:               3]
         [Skipped:              0(  0.0%)]
         [Hashed:               3(100.0%)]
         [Known:                0(  0.0%)]
         [New:                  3(100.0%)    160.0B]
      [Deduplicated:            3(100.0%)    160.0B(100.0%)]
         [Young:                3(100.0%)    160.0B(100.0%)]
         [Old:                  0(  0.0%)      0.0B(  0.0%)]
   [Total Exec: 2868/0.1946124 secs, Idle: 2868/83382.9701762 secs, Blocked: 13/0.0032760 secs]
      [Inspected:          304493]
         [Skipped:              0(  0.0%)]
         [Hashed:          163708( 53.8%)]
         [Known:            44808( 14.7%)]
         [New:             259685( 85.3%)     21.9M]
      [Deduplicated:       160467( 61.8%)     10.6M( 48.3%)]
         [Young:            83546( 52.1%)   6270.6K( 57.8%)]
         [Old:              76921( 47.9%)   4571.3K( 42.2%)]
   [Table]
      [Memory Usage: 4291.8K]
      [Size: 131072, Min: 1024, Max: 16777216]
      [Entries: 133319, Load: 101.7%, Cached: 6107, Added: 142389, Removed: 9070]
      [Resize Count: 7, Shrink Threshold: 87381(66.7%), Grow Threshold: 262144(200.0%)]
      [Rehash Count: 0, Rehash Threshold: 120, Hash Seed: 0x0]
      [Age Threshold: 3]
   [Queue]
      [Dropped: 0]
2015-05-31T18:25:25.145+0200: 83383.897: [GC concurrent-root-region-scan-end, 0.0140467 secs]
2015-05-31T18:25:25.145+0200: 83383.897: [GC concurrent-mark-start]
2015-05-31T18:25:35.563+0200: 83394.312: [GC concurrent-mark-end, 10.4145795 secs]
2015-05-31T18:25:35.563+0200: 83394.312: [GC remark 83394.312: [Finalize Marking, 0.0002939 secs] 83394.312: [GC ref-proc, 1.2128584 secs] 83395.525: [Unloading, 14.5180500 secs], 15.7320854 secs]
 [Times: user=0.14 sys=0.22, real=15.73 secs] 
2015-05-31T18:25:51.288+0200: 83410.045: [GC cleanup 334M->326M(512M), 0.2836092 secs]
 [Times: user=0.00 sys=0.00, real=0.28 secs] 
2015-05-31T18:25:51.570+0200: 83410.328: [GC concurrent-cleanup-start]
2015-05-31T18:25:51.570+0200: 83410.328: [GC concurrent-cleanup-end, 0.0000669 secs]
2015-05-31T18:26:03.732+0200: 83422.482: [GC pause (G1 Evacuation Pause) (young), 0.1031257 secs]
   [Parallel Time: 91.6 ms, GC Workers: 4]
      [GC Worker Start (ms): Min: 83422481.7, Avg: 83422481.7, Max: 83422481.8, Diff: 0.0]
      [Ext Root Scanning (ms): Min: 1.3, Avg: 1.7, Max: 2.7, Diff: 1.4, Sum: 6.9]
      [Update RS (ms): Min: 0.0, Avg: 22.7, Max: 89.8, Diff: 89.8, Sum: 90.8]
         [Processed Buffers: Min: 0, Avg: 7.3, Max: 15, Diff: 15, Sum: 29]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 0.5, Avg: 2.4, Max: 3.4, Diff: 2.9, Sum: 9.5]
      [Termination (ms): Min: 0.0, Avg: 64.7, Max: 86.3, Diff: 86.3, Sum: 258.9]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 91.6, Avg: 91.6, Max: 91.6, Diff: 0.0, Sum: 366.3]
      [GC Worker End (ms): Min: 83422573.3, Avg: 83422573.3, Max: 83422573.3, Diff: 0.0]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [String Dedup Fixup: 2.1 ms, GC Workers: 4]
      [Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Table Fixup (ms): Min: 1.9, Avg: 1.9, Max: 1.9, Diff: 0.1, Sum: 7.7]
   [Clear CT: 0.1 ms]
   [Other: 9.3 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 8.8 ms]
      [Ref Enq: 0.1 ms]
      [Redirty Cards: 0.0 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.1 ms]
   [Eden: 215.0M(215.0M)->0.0B(19.0M) Survivors: 7168.0K->6144.0K Heap: 443.6M(512.0M)->228.2M(512.0M)]
 [Times: user=0.30 sys=0.01, real=0.10 secs] 
2015-05-31T18:26:03.848+0200: 83422.597: [GC concurrent-string-deduplication, 160.0B->0.0B(160.0B), avg 48.3%, 0.0123951 secs]
   [Last Exec: 0.0123951 secs, Idle: 38.7017788 secs, Blocked: 0/0.0000000 secs]
      [Inspected:               3]
         [Skipped:              0(  0.0%)]
         [Hashed:               3(100.0%)]
         [Known:                0(  0.0%)]
         [New:                  3(100.0%)    160.0B]
      [Deduplicated:            3(100.0%)    160.0B(100.0%)]
         [Young:                3(100.0%)    160.0B(100.0%)]
         [Old:                  0(  0.0%)      0.0B(  0.0%)]
   [Total Exec: 2869/0.2070075 secs, Idle: 2869/83421.6719550 secs, Blocked: 13/0.0032760 secs]
      [Inspected:          304496]
         [Skipped:              0(  0.0%)]
         [Hashed:          163711( 53.8%)]
         [Known:            44808( 14.7%)]
         [New:             259688( 85.3%)     21.9M]
      [Deduplicated:       160470( 61.8%)     10.6M( 48.3%)]
         [Young:            83549( 52.1%)   6270.8K( 57.8%)]
         [Old:              76921( 47.9%)   4571.3K( 42.2%)]
   [Table]
      [Memory Usage: 2565.5K]
      [Size: 65536, Min: 1024, Max: 16777216]
      [Entries: 81061, Load: 123.7%, Cached: 6553, Added: 142396, Removed: 61335]
      [Resize Count: 8, Shrink Threshold: 43690(66.7%), Grow Threshold: 131072(200.0%)]
      [Rehash Count: 0, Rehash Threshold: 120, Hash Seed: 0x0]
      [Age Threshold: 3]
   [Queue]
      [Dropped: 0]
2015-05-31T18:26:05.769+0200: 83424.518: [GC pause (G1 Evacuation Pause) (mixed), 0.2232916 secs]
   [Parallel Time: 216.7 ms, GC Workers: 4]
      [GC Worker Start (ms): Min: 83424518.3, Avg: 83424518.3, Max: 83424518.3, Diff: 0.0]
      [Ext Root Scanning (ms): Min: 1.2, Avg: 1.6, Max: 2.6, Diff: 1.4, Sum: 6.5]
      [Update RS (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 1.2]
         [Processed Buffers: Min: 0, Avg: 4.3, Max: 7, Diff: 7, Sum: 17]
      [Scan RS (ms): Min: 56.1, Avg: 102.3, Max: 144.4, Diff: 88.3, Sum: 409.2]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.3]
      [Object Copy (ms): Min: 50.4, Avg: 97.6, Max: 157.7, Diff: 107.2, Sum: 390.2]
      [Termination (ms): Min: 0.0, Avg: 14.8, Max: 19.8, Diff: 19.8, Sum: 59.1]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 216.6, Avg: 216.6, Max: 216.6, Diff: 0.0, Sum: 866.5]
      [GC Worker End (ms): Min: 83424734.9, Avg: 83424734.9, Max: 83424734.9, Diff: 0.0]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [String Dedup Fixup: 1.5 ms, GC Workers: 4]
      [Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Table Fixup (ms): Min: 1.4, Avg: 1.4, Max: 1.4, Diff: 0.0, Sum: 5.6]
   [Clear CT: 0.2 ms]
   [Other: 4.8 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.9 ms]
      [Ref Enq: 0.0 ms]
      [Redirty Cards: 0.2 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.2 ms]
   [Eden: 19.0M(19.0M)->0.0B(21.0M) Survivors: 6144.0K->4096.0K Heap: 247.2M(512.0M)->175.2M(512.0M)]
 [Times: user=0.09 sys=0.00, real=0.22 secs] 
2015-05-31T18:26:05.992+0200: 83424.742: [GC concurrent-string-deduplication, 640.0B->152.0B(488.0B), avg 48.3%, 0.0000246 secs]
   [Last Exec: 0.0000246 secs, Idle: 2.1442834 secs, Blocked: 0/0.0000000 secs]
      [Inspected:               6]
         [Skipped:              0(  0.0%)]
         [Hashed:               5( 83.3%)]
         [Known:                0(  0.0%)]
         [New:                  6(100.0%)    640.0B]
      [Deduplicated:            5( 83.3%)    488.0B( 76.3%)]
         [Young:                5(100.0%)    488.0B(100.0%)]
         [Old:                  0(  0.0%)      0.0B(  0.0%)]
   [Total Exec: 2870/0.2070321 secs, Idle: 2870/83423.8162384 secs, Blocked: 13/0.0032760 secs]
      [Inspected:          304502]
         [Skipped:              0(  0.0%)]
         [Hashed:          163716( 53.8%)]
         [Known:            44808( 14.7%)]
         [New:             259694( 85.3%)     21.9M]
      [Deduplicated:       160475( 61.8%)     10.6M( 48.3%)]
         [Young:            83554( 52.1%)   6271.2K( 57.8%)]
         [Old:              76921( 47.9%)   4571.3K( 42.2%)]
   [Table]
      [Memory Usage: 2564.6K]
      [Size: 65536, Min: 1024, Max: 16777216]
      [Entries: 81026, Load: 123.6%, Cached: 6553, Added: 142397, Removed: 61371]
      [Resize Count: 8, Shrink Threshold: 43690(66.7%), Grow Threshold: 131072(200.0%)]
      [Rehash Count: 0, Rehash Threshold: 120, Hash Seed: 0x0]
      [Age Threshold: 3]
   [Queue]
      [Dropped: 0]
2015-05-31T18:26:08.157+0200: 83426.906: [GC pause (G1 Evacuation Pause) (mixed), 0.6216666 secs]
   [Parallel Time: 618.5 ms, GC Workers: 4]
      [GC Worker Start (ms): Min: 83426906.5, Avg: 83426906.5, Max: 83426906.5, Diff: 0.0]
      [Ext Root Scanning (ms): Min: 0.3, Avg: 8.0, Max: 15.7, Diff: 15.3, Sum: 31.9]
      [Update RS (ms): Min: 0.0, Avg: 4.5, Max: 8.5, Diff: 8.5, Sum: 17.9]
         [Processed Buffers: Min: 0, Avg: 7.0, Max: 18, Diff: 18, Sum: 28]
      [Scan RS (ms): Min: 13.4, Avg: 28.4, Max: 65.2, Diff: 51.8, Sum: 113.7]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2]
      [Object Copy (ms): Min: 532.6, Avg: 577.3, Max: 604.5, Diff: 71.9, Sum: 2309.1]
      [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 0.7]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 618.4, Avg: 618.4, Max: 618.4, Diff: 0.0, Sum: 2473.6]
      [GC Worker End (ms): Min: 83427524.9, Avg: 83427524.9, Max: 83427524.9, Diff: 0.0]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [String Dedup Fixup: 1.3 ms, GC Workers: 4]
      [Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Table Fixup (ms): Min: 1.2, Avg: 1.2, Max: 1.3, Diff: 0.1, Sum: 4.9]
   [Clear CT: 0.1 ms]
   [Other: 1.6 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 1.0 ms]
      [Ref Enq: 0.0 ms]
      [Redirty Cards: 0.0 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.2 ms]
   [Eden: 21.0M(21.0M)->0.0B(21.0M) Survivors: 4096.0K->4096.0K Heap: 196.2M(512.0M)->129.4M(512.0M)]
 [Times: user=0.08 sys=0.02, real=0.62 secs] 
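To see at a glance which stop-the-world events in an excerpt like this exceed the 5-second budget, a small Python sketch can pull out the pause, remark and cleanup lines. It assumes the JDK 8 style G1 log layout shown above, where the duration is the last "N secs]" on the event's first line; `stw_pauses` and `over_budget` are names made up for this illustration.

```python
import re

# Stop-the-world G1 events in this log style: evacuation pauses, remark, cleanup.
# Concurrent phases start with "[GC concurrent-" and are deliberately not matched.
STW = re.compile(r"\[GC (pause|remark|cleanup)\b")
SECS = re.compile(r"(\d+\.\d+) secs\]")


def stw_pauses(log_lines):
    """Return [(event_kind, seconds)] for each stop-the-world header line."""
    pauses = []
    for line in log_lines:
        match = STW.search(line)
        durations = SECS.findall(line)
        if match and durations:
            # The last "N secs]" on the line is the total for the whole event.
            pauses.append((match.group(1), float(durations[-1])))
    return pauses


def over_budget(pauses, budget_s=5.0):
    """Keep only the events longer than the pause budget (5 s in the question)."""
    return [(kind, secs) for kind, secs in pauses if secs > budget_s]


# Header lines copied from the excerpt above (timestamps shortened for brevity).
SAMPLE = [
    "83383.755: [GC pause (G1 Evacuation Pause) (young) (initial-mark), 0.1280453 secs]",
    "83383.897: [GC concurrent-mark-start]",
    "83394.312: [GC concurrent-mark-end, 10.4145795 secs]",
    "83394.312: [GC remark 83394.312: [Finalize Marking, 0.0002939 secs] 83394.312: "
    "[GC ref-proc, 1.2128584 secs] 83395.525: [Unloading, 14.5180500 secs], 15.7320854 secs]",
    "83410.045: [GC cleanup 334M->326M(512M), 0.2836092 secs]",
    "83410.328: [GC concurrent-cleanup-end, 0.0000669 secs]",
    "83422.482: [GC pause (G1 Evacuation Pause) (young), 0.1031257 secs]",
    "83424.518: [GC pause (G1 Evacuation Pause) (mixed), 0.2232916 secs]",
    "83426.906: [GC pause (G1 Evacuation Pause) (mixed), 0.6216666 secs]",
]

pauses = stw_pauses(SAMPLE)
print(over_budget(pauses))
```

On these lines only the remark event is over budget; the young and mixed evacuation pauses all stay under one second.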

Edit: results after a few hours:

[Two screenshots, not preserved]

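There are a lot of Pages/sec and Page Inputs/sec, as well as page faults. Is this normal? Where can I set up monitoring of Pages/sec and Page Inputs/sec for the JVM only (I only found page faults)?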

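I guess you are barking up the wrong tree - I suspect garbage collection is not your problem...

You are only running a 512 MiB heap - to me, a long pause for a heap of that size would be 1 or 2 seconds. Even huge (32 GiB) heaps can pause for only a few milliseconds.

I would expect the problem to be related to your server itself - either other applications on the machine are using enough memory to push your Java process (which is about 50% larger than the heap) into swap / virtual memory - or you are running your application in a virtualised environment (which may have memory overcommit / memory ballooning issues).

As a very rough indicator, any GC algorithm should be able to churn through 100 MiB per second - so if what you see is worse than that, look elsewhere for the problem.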
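Applied to this case, the rule of thumb can be written out as a few lines of Python. This is only back-of-the-envelope arithmetic: the 100 MiB/s rate is the rough figure from above, the heap size and the observed pauses come from the question, and `expected_worst_pause_s` is a made-up helper name.

```python
ROUGH_GC_RATE_MIB_PER_S = 100.0  # rule-of-thumb rate, not a measured value


def expected_worst_pause_s(heap_mib, rate_mib_per_s=ROUGH_GC_RATE_MIB_PER_S):
    """Time to work through the whole heap once at the rule-of-thumb rate."""
    return heap_mib / rate_mib_per_s


heap_mib = 512                 # -Xms512m / -Xmx512m from the question
observed_remark_s = 15.73      # G1 remark pause from the log
observed_parallel_s = 70.0     # worst ParallelGC major pause reported

expected_s = expected_worst_pause_s(heap_mib)
print(f"expected at most about {expected_s:.1f} s for {heap_mib} MiB")
print(f"G1 remark was {observed_remark_s / expected_s:.1f}x that")
print(f"ParallelGC pause was {observed_parallel_s / expected_s:.1f}x that")
```

Even under this generous bound, both reported pauses are several times longer than touching the whole 512 MiB heap should take, which is why the cause is more likely outside the collector.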

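In this case, I think GC is the symptom, not the problem.

[Times: user=0.14 sys=0.22, real=15.73 secs]

This means it spent far more wall-clock time GCing than actual CPU time. All the GCs you tried are multi-threaded and purely CPU-bound, which means they should normally consume more CPU time than wall-clock time.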
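The same check can be run over every `[Times: ...]` line in a log. The Python sketch below assumes the line format seen in the question's log; `cpu_to_wall_ratio` is a made-up helper name. A ratio well below 1 on a multi-threaded collector suggests the GC threads were waiting (paging, CPU starvation, safepoint delays) rather than working.

```python
import re

TIMES = re.compile(
    r"\[Times: user=(\d+\.\d+) sys=(\d+\.\d+), real=(\d+\.\d+) secs\]"
)


def cpu_to_wall_ratio(line):
    """Return (user + sys) / real for a '[Times: ...]' line, or None if real is 0."""
    match = TIMES.search(line)
    if match is None:
        raise ValueError("not a [Times: ...] line: " + line)
    user, system, real = (float(value) for value in match.groups())
    if real == 0.0:
        return None  # too short to measure at the log's resolution
    return (user + system) / real


# Both lines are copied from the G1 log in the question.
remark_ratio = cpu_to_wall_ratio(" [Times: user=0.14 sys=0.22, real=15.73 secs]")
young_ratio = cpu_to_wall_ratio(" [Times: user=0.30 sys=0.01, real=0.10 secs]")

print(f"remark pause: {remark_ratio:.3f} CPU seconds per wall second")
print(f"young pause:  {young_ratio:.1f} CPU seconds per wall second")
```

The healthy young pause uses about three CPU seconds per wall-clock second across its 4 GC workers, while the remark pause uses about 0.02, so for nearly the whole 15.73 seconds the threads were not running on a CPU.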

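There are two possible causes that would change this:

  • Resource starvation: some part of the heap has been swapped out, or other processes are hogging the CPU. As @Mikaveli suggested, use the OS monitoring tools to get more information.
  • Something is preventing the VM from reaching a safepoint for several seconds. You can check this with -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1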

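Also, since it happens during a remark pause, using -XX:-ClassUnloadingWithConcurrentMark might fix it, but I guess it would only shift the problem to the regular GCs.

It may be useful to trace how much it is actually trying to unload with -XX:+TraceClassUnloading. Application containers such as Glassfish might do something weird that causes a lot of classes to pile up.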
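Once that tracing is on, the output can be tallied per package to see what is piling up. The Python sketch below assumes the JDK 8 style line `[Unloading class <name> 0x<address>]`; both that format and the sample lines are assumptions for illustration, not output from the server in the question, and `unloaded_by_package` is a made-up helper name.

```python
import re
from collections import Counter

# Assumed trace format: "[Unloading class <fully.qualified.Name> 0x<address>]".
UNLOAD = re.compile(r"\[Unloading class (\S+)")


def unloaded_by_package(lines, depth=2):
    """Count unloaded classes, grouped by the first `depth` package segments."""
    counts = Counter()
    for line in lines:
        match = UNLOAD.search(line)
        if match:
            segments = match.group(1).split(".")[:-1]  # drop the class name
            counts[".".join(segments[:depth]) or "(default)"] += 1
    return counts


# Hypothetical sample lines, made up for this example.
SAMPLE = [
    "[Unloading class sun.reflect.GeneratedMethodAccessor12 0x00000007c0a1b028]",
    "[Unloading class sun.reflect.GeneratedConstructorAccessor3 0x00000007c0a1b428]",
    "[Unloading class com.example.app.Foo 0x00000007c0a1b828]",
    "some unrelated log line",
]

counts = unloaded_by_package(SAMPLE)
for package, count in counts.most_common():
    print(f"{count:6d}  {package}")
```

If a single package accounts for most of the unloaded classes, that is a good hint about which part of the container or application keeps generating them.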

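Edit: for monitoring, the things you most need to watch are free physical memory (minus caches), CPU load, and page ins/outs/faults. Ideally, monitor paging at the process level, since it does not matter to the JVM whether some other process is waiting on the disk.

As for CMS vs. G1: that is probably unrelated to your problem.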

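Take a look at this article; can you tell which of its main areas the long GC pauses fall into in your case (the article has steps for determining this)? Other OS activity, for instance?

It could also be due to a JVM bug. Check which JVM version you have.

Link: https://blogs.oracle.com/poonam/entry/troubleshooting_long_gc_pauses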
