JVM GC problems

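For the past few weeks I have been testing different JVM settings for a Glassfish server. The main settings for the heap (and other things) are: -Xms512m, -Xmx512m, -XX:NewRatio=2. I tried different GC settings, but I still run into long pauses a few days after the server is started. I found the following:
1. -XX:+UseParallelGC -XX:+UseParallelOldGC - a minor GC happens every minute and a major GC every 18 hours. I have no problems with the minor GCs, but after 5 days the major GCs become a problem. At first the major GC pauses last about 100-200 ms, but the last pause lasted 70 seconds.
2. -XX:+UseConcMarkSweepGC -XX:+UseParNewGC - almost the same as above. Minor GCs are fine, but the major GC (not full) pauses become very long. I noticed that the problem is class unloading in the GC (CMS Final Remark) phase, which is a stop-the-world phase.
3. -XX:+UseConcMarkSweepGC -XX:+UseParNewGC with -XX:MaxGCPauseMillis=5000. I tested this for only one day, because the second major GC (not full) already lasted about 20 seconds, so I assume something else is wrong.
4. -XX:+UseG1GC, -XX:MaxGCPauseMillis=5000, -XX:+UseStringDeduplication, without the -XX:NewRatio=2 option - a major GC (not full) happens every 12 hours, and I have already noticed some problems: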

2015-05-31T18:25:25.145+0200: 83383.897: [GC concurrent-mark-start]
2015-05-31T18:25:35.563+0200: 83394.312: [GC concurrent-mark-end, 10.4145795 secs]
2015-05-31T18:25:35.563+0200: 83394.312: [GC remark 83394.312: [Finalize Marking, 0.0002939 secs] 83394.312: [GC ref-proc, 1.2128584 secs] 83395.525: [Unloading, 14.5180500 secs], 15.7320854 secs]
 [Times: user=0.14 sys=0.22, real=15.73 secs] 

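The GC remark phase took 15 seconds, which is unacceptable for me. You can see that unloading takes most of that time. This also happened with the other GCs before, so I think something must be wrong with class unloading.

Summary: every GC runs fine for a while, but after a few days the problems start and the pauses become very long. I don't know why it works fine for the first few days and then suddenly gets this bad. I noticed that the longer pauses are caused by class unloading, so I would like to know whether some setting gives better results. I would also like to know which GC you would recommend for me. I have an internal web application running on a Glassfish server on a PC with 8 GB of RAM, an i7 processor and Windows 8. At most 10 clients are connected at the same time, but it must have a long uptime and must not have long pauses (5 seconds at most). Please tell me what else I can do to shorten the pauses.

One more question: in my case, what are the downsides of using G1GC instead of CMS or ParallelGC? Is the heap too small for G1GC?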

Edit: G1GC log before and after the long pause in the GC remark phase

2015-05-31T18:25:25.004+0200: 83383.755: [GC pause (G1 Evacuation Pause) (young) (initial-mark), 0.1280453 secs]
   [Parallel Time: 116.2 ms, GC Workers: 4]
      [GC Worker Start (ms): Min: 83383757.6, Avg: 83383757.7, Max: 83383757.7, Diff: 0.0]
      [Ext Root Scanning (ms): Min: 97.8, Avg: 98.3, Max: 98.5, Diff: 0.7, Sum: 393.1]
      [Update RS (ms): Min: 0.2, Avg: 4.0, Max: 14.8, Diff: 14.6, Sum: 16.1]
         [Processed Buffers: Min: 1, Avg: 6.0, Max: 16, Diff: 15, Sum: 24]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.1]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 0.2, Avg: 2.5, Max: 3.7, Diff: 3.5, Sum: 10.2]
      [Termination (ms): Min: 0.0, Avg: 8.5, Max: 11.4, Diff: 11.4, Sum: 34.2]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [GC Worker Total (ms): Min: 113.4, Avg: 113.4, Max: 113.5, Diff: 0.0, Sum: 453.8]
      [GC Worker End (ms): Min: 83383871.1, Avg: 83383871.1, Max: 83383871.1, Diff: 0.0]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [String Dedup Fixup: 2.2 ms, GC Workers: 4]
      [Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Table Fixup (ms): Min: 2.0, Avg: 2.1, Max: 2.1, Diff: 0.1, Sum: 8.3]
   [Clear CT: 0.1 ms]
   [Other: 9.5 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 8.8 ms]
      [Ref Enq: 0.1 ms]
      [Redirty Cards: 0.3 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.1 ms]
   [Eden: 215.0M(215.0M)->0.0B(215.0M) Survivors: 7168.0K->7168.0K Heap: 451.5M(512.0M)->236.6M(512.0M)]
 [Times: user=0.08 sys=0.02, real=0.13 secs] 
2015-05-31T18:25:25.129+0200: 83383.883: [GC concurrent-root-region-scan-start]
2015-05-31T18:25:25.129+0200: 83383.883: [GC concurrent-string-deduplication, 160.0B->0.0B(160.0B), avg 48.3%, 0.0000070 secs]
   [Last Exec: 0.0000070 secs, Idle: 23.1834927 secs, Blocked: 0/0.0000000 secs]
      [Inspected:               3]
         [Skipped:              0(  0.0%)]
         [Hashed:               3(100.0%)]
         [Known:                0(  0.0%)]
         [New:                  3(100.0%)    160.0B]
      [Deduplicated:            3(100.0%)    160.0B(100.0%)]
         [Young:                3(100.0%)    160.0B(100.0%)]
         [Old:                  0(  0.0%)      0.0B(  0.0%)]
   [Total Exec: 2868/0.1946124 secs, Idle: 2868/83382.9701762 secs, Blocked: 13/0.0032760 secs]
      [Inspected:          304493]
         [Skipped:              0(  0.0%)]
         [Hashed:          163708( 53.8%)]
         [Known:            44808( 14.7%)]
         [New:             259685( 85.3%)     21.9M]
      [Deduplicated:       160467( 61.8%)     10.6M( 48.3%)]
         [Young:            83546( 52.1%)   6270.6K( 57.8%)]
         [Old:              76921( 47.9%)   4571.3K( 42.2%)]
   [Table]
      [Memory Usage: 4291.8K]
      [Size: 131072, Min: 1024, Max: 16777216]
      [Entries: 133319, Load: 101.7%, Cached: 6107, Added: 142389, Removed: 9070]
      [Resize Count: 7, Shrink Threshold: 87381(66.7%), Grow Threshold: 262144(200.0%)]
      [Rehash Count: 0, Rehash Threshold: 120, Hash Seed: 0x0]
      [Age Threshold: 3]
   [Queue]
      [Dropped: 0]
2015-05-31T18:25:25.145+0200: 83383.897: [GC concurrent-root-region-scan-end, 0.0140467 secs]
2015-05-31T18:25:25.145+0200: 83383.897: [GC concurrent-mark-start]
2015-05-31T18:25:35.563+0200: 83394.312: [GC concurrent-mark-end, 10.4145795 secs]
2015-05-31T18:25:35.563+0200: 83394.312: [GC remark 83394.312: [Finalize Marking, 0.0002939 secs] 83394.312: [GC ref-proc, 1.2128584 secs] 83395.525: [Unloading, 14.5180500 secs], 15.7320854 secs]
 [Times: user=0.14 sys=0.22, real=15.73 secs] 
2015-05-31T18:25:51.288+0200: 83410.045: [GC cleanup 334M->326M(512M), 0.2836092 secs]
 [Times: user=0.00 sys=0.00, real=0.28 secs] 
2015-05-31T18:25:51.570+0200: 83410.328: [GC concurrent-cleanup-start]
2015-05-31T18:25:51.570+0200: 83410.328: [GC concurrent-cleanup-end, 0.0000669 secs]
2015-05-31T18:26:03.732+0200: 83422.482: [GC pause (G1 Evacuation Pause) (young), 0.1031257 secs]
   [Parallel Time: 91.6 ms, GC Workers: 4]
      [GC Worker Start (ms): Min: 83422481.7, Avg: 83422481.7, Max: 83422481.8, Diff: 0.0]
      [Ext Root Scanning (ms): Min: 1.3, Avg: 1.7, Max: 2.7, Diff: 1.4, Sum: 6.9]
      [Update RS (ms): Min: 0.0, Avg: 22.7, Max: 89.8, Diff: 89.8, Sum: 90.8]
         [Processed Buffers: Min: 0, Avg: 7.3, Max: 15, Diff: 15, Sum: 29]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 0.5, Avg: 2.4, Max: 3.4, Diff: 2.9, Sum: 9.5]
      [Termination (ms): Min: 0.0, Avg: 64.7, Max: 86.3, Diff: 86.3, Sum: 258.9]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 91.6, Avg: 91.6, Max: 91.6, Diff: 0.0, Sum: 366.3]
      [GC Worker End (ms): Min: 83422573.3, Avg: 83422573.3, Max: 83422573.3, Diff: 0.0]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [String Dedup Fixup: 2.1 ms, GC Workers: 4]
      [Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Table Fixup (ms): Min: 1.9, Avg: 1.9, Max: 1.9, Diff: 0.1, Sum: 7.7]
   [Clear CT: 0.1 ms]
   [Other: 9.3 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 8.8 ms]
      [Ref Enq: 0.1 ms]
      [Redirty Cards: 0.0 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.1 ms]
   [Eden: 215.0M(215.0M)->0.0B(19.0M) Survivors: 7168.0K->6144.0K Heap: 443.6M(512.0M)->228.2M(512.0M)]
 [Times: user=0.30 sys=0.01, real=0.10 secs] 
2015-05-31T18:26:03.848+0200: 83422.597: [GC concurrent-string-deduplication, 160.0B->0.0B(160.0B), avg 48.3%, 0.0123951 secs]
   [Last Exec: 0.0123951 secs, Idle: 38.7017788 secs, Blocked: 0/0.0000000 secs]
      [Inspected:               3]
         [Skipped:              0(  0.0%)]
         [Hashed:               3(100.0%)]
         [Known:                0(  0.0%)]
         [New:                  3(100.0%)    160.0B]
      [Deduplicated:            3(100.0%)    160.0B(100.0%)]
         [Young:                3(100.0%)    160.0B(100.0%)]
         [Old:                  0(  0.0%)      0.0B(  0.0%)]
   [Total Exec: 2869/0.2070075 secs, Idle: 2869/83421.6719550 secs, Blocked: 13/0.0032760 secs]
      [Inspected:          304496]
         [Skipped:              0(  0.0%)]
         [Hashed:          163711( 53.8%)]
         [Known:            44808( 14.7%)]
         [New:             259688( 85.3%)     21.9M]
      [Deduplicated:       160470( 61.8%)     10.6M( 48.3%)]
         [Young:            83549( 52.1%)   6270.8K( 57.8%)]
         [Old:              76921( 47.9%)   4571.3K( 42.2%)]
   [Table]
      [Memory Usage: 2565.5K]
      [Size: 65536, Min: 1024, Max: 16777216]
      [Entries: 81061, Load: 123.7%, Cached: 6553, Added: 142396, Removed: 61335]
      [Resize Count: 8, Shrink Threshold: 43690(66.7%), Grow Threshold: 131072(200.0%)]
      [Rehash Count: 0, Rehash Threshold: 120, Hash Seed: 0x0]
      [Age Threshold: 3]
   [Queue]
      [Dropped: 0]
2015-05-31T18:26:05.769+0200: 83424.518: [GC pause (G1 Evacuation Pause) (mixed), 0.2232916 secs]
   [Parallel Time: 216.7 ms, GC Workers: 4]
      [GC Worker Start (ms): Min: 83424518.3, Avg: 83424518.3, Max: 83424518.3, Diff: 0.0]
      [Ext Root Scanning (ms): Min: 1.2, Avg: 1.6, Max: 2.6, Diff: 1.4, Sum: 6.5]
      [Update RS (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 1.2]
         [Processed Buffers: Min: 0, Avg: 4.3, Max: 7, Diff: 7, Sum: 17]
      [Scan RS (ms): Min: 56.1, Avg: 102.3, Max: 144.4, Diff: 88.3, Sum: 409.2]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.3]
      [Object Copy (ms): Min: 50.4, Avg: 97.6, Max: 157.7, Diff: 107.2, Sum: 390.2]
      [Termination (ms): Min: 0.0, Avg: 14.8, Max: 19.8, Diff: 19.8, Sum: 59.1]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 216.6, Avg: 216.6, Max: 216.6, Diff: 0.0, Sum: 866.5]
      [GC Worker End (ms): Min: 83424734.9, Avg: 83424734.9, Max: 83424734.9, Diff: 0.0]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [String Dedup Fixup: 1.5 ms, GC Workers: 4]
      [Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Table Fixup (ms): Min: 1.4, Avg: 1.4, Max: 1.4, Diff: 0.0, Sum: 5.6]
   [Clear CT: 0.2 ms]
   [Other: 4.8 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.9 ms]
      [Ref Enq: 0.0 ms]
      [Redirty Cards: 0.2 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.2 ms]
   [Eden: 19.0M(19.0M)->0.0B(21.0M) Survivors: 6144.0K->4096.0K Heap: 247.2M(512.0M)->175.2M(512.0M)]
 [Times: user=0.09 sys=0.00, real=0.22 secs] 
2015-05-31T18:26:05.992+0200: 83424.742: [GC concurrent-string-deduplication, 640.0B->152.0B(488.0B), avg 48.3%, 0.0000246 secs]
   [Last Exec: 0.0000246 secs, Idle: 2.1442834 secs, Blocked: 0/0.0000000 secs]
      [Inspected:               6]
         [Skipped:              0(  0.0%)]
         [Hashed:               5( 83.3%)]
         [Known:                0(  0.0%)]
         [New:                  6(100.0%)    640.0B]
      [Deduplicated:            5( 83.3%)    488.0B( 76.3%)]
         [Young:                5(100.0%)    488.0B(100.0%)]
         [Old:                  0(  0.0%)      0.0B(  0.0%)]
   [Total Exec: 2870/0.2070321 secs, Idle: 2870/83423.8162384 secs, Blocked: 13/0.0032760 secs]
      [Inspected:          304502]
         [Skipped:              0(  0.0%)]
         [Hashed:          163716( 53.8%)]
         [Known:            44808( 14.7%)]
         [New:             259694( 85.3%)     21.9M]
      [Deduplicated:       160475( 61.8%)     10.6M( 48.3%)]
         [Young:            83554( 52.1%)   6271.2K( 57.8%)]
         [Old:              76921( 47.9%)   4571.3K( 42.2%)]
   [Table]
      [Memory Usage: 2564.6K]
      [Size: 65536, Min: 1024, Max: 16777216]
      [Entries: 81026, Load: 123.6%, Cached: 6553, Added: 142397, Removed: 61371]
      [Resize Count: 8, Shrink Threshold: 43690(66.7%), Grow Threshold: 131072(200.0%)]
      [Rehash Count: 0, Rehash Threshold: 120, Hash Seed: 0x0]
      [Age Threshold: 3]
   [Queue]
      [Dropped: 0]
2015-05-31T18:26:08.157+0200: 83426.906: [GC pause (G1 Evacuation Pause) (mixed), 0.6216666 secs]
   [Parallel Time: 618.5 ms, GC Workers: 4]
      [GC Worker Start (ms): Min: 83426906.5, Avg: 83426906.5, Max: 83426906.5, Diff: 0.0]
      [Ext Root Scanning (ms): Min: 0.3, Avg: 8.0, Max: 15.7, Diff: 15.3, Sum: 31.9]
      [Update RS (ms): Min: 0.0, Avg: 4.5, Max: 8.5, Diff: 8.5, Sum: 17.9]
         [Processed Buffers: Min: 0, Avg: 7.0, Max: 18, Diff: 18, Sum: 28]
      [Scan RS (ms): Min: 13.4, Avg: 28.4, Max: 65.2, Diff: 51.8, Sum: 113.7]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2]
      [Object Copy (ms): Min: 532.6, Avg: 577.3, Max: 604.5, Diff: 71.9, Sum: 2309.1]
      [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 0.7]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 618.4, Avg: 618.4, Max: 618.4, Diff: 0.0, Sum: 2473.6]
      [GC Worker End (ms): Min: 83427524.9, Avg: 83427524.9, Max: 83427524.9, Diff: 0.0]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [String Dedup Fixup: 1.3 ms, GC Workers: 4]
      [Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Table Fixup (ms): Min: 1.2, Avg: 1.2, Max: 1.3, Diff: 0.1, Sum: 4.9]
   [Clear CT: 0.1 ms]
   [Other: 1.6 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 1.0 ms]
      [Ref Enq: 0.0 ms]
      [Redirty Cards: 0.0 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.2 ms]
   [Eden: 21.0M(21.0M)->0.0B(21.0M) Survivors: 4096.0K->4096.0K Heap: 196.2M(512.0M)->129.4M(512.0M)]
 [Times: user=0.08 sys=0.02, real=0.62 secs] 

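Edit: results after a few hours:

[Two screenshots in the original post; images not available]

There are a lot of Pages/sec and Pages Input/sec as well as page faults. Is this normal? Where can I set up monitoring of Pages/sec and Pages Input/sec for the JVM only (I only found page faults)?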

I'd guess you're barking up the wrong tree - I suspect garbage collection isn't your problem...

You're only running a 512 MiB heap - to me, a long pause for a heap of that size would be 1 or 2 seconds. Huge (32 GiB) heaps may stop within a few milliseconds.

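I'd expect the problem is actually with your server: either the other applications you mention are using enough memory to push your Java process (which is about 50% larger than the heap) into swap / virtual memory, or you're running your application in a virtualised environment (with possible memory overcommit / memory ballooning issues).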

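As a very rough indicator, any GC algorithm should be able to churn through 100 MiB per second, so if you're seeing anything worse than that, look elsewhere for the problem.

In this case, I think GC is the symptom, not the problem.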

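[Times: user=0.14 sys=0.22, real=15.73 secs]

This means it spent more wall time GCing than actual CPU time. All the GCs you tried are multi-threaded and purely CPU-bound, which means they should normally burn more CPU time than wall time.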

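There are two likely causes that would change this:

  • Resource starvation: either part of the heap is swapped out or other processes are hogging the CPU. As @Mikaveli suggested, use OS monitoring tools to get more information.
  • Something prevents the VM from reaching a safepoint for several seconds. You can check that with -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1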
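
The wall-time versus CPU-time comparison above can be automated over a whole GC log. The following is only a minimal sketch: the class name GcTimesCheck, its method names, and the 1-second threshold are made up for illustration, and it assumes the "[Times: user=... sys=..., real=... secs]" format shown in the logs above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcTimesCheck {
    // Matches the "[Times: user=... sys=..., real=... secs]" line from the GC log.
    private static final Pattern TIMES = Pattern.compile(
            "\\[Times: user=([0-9.]+) sys=([0-9.]+), real=([0-9.]+) secs\\]");

    /** Returns real - (user + sys) in seconds, or NaN if the line has no Times entry. */
    public static double stallSeconds(String line) {
        Matcher m = TIMES.matcher(line);
        if (!m.find()) {
            return Double.NaN;
        }
        double user = Double.parseDouble(m.group(1));
        double sys = Double.parseDouble(m.group(2));
        double real = Double.parseDouble(m.group(3));
        return real - (user + sys);
    }

    /** True when wall time exceeds CPU time by more than the (hypothetical) threshold. */
    public static boolean looksStalled(String line, double thresholdSeconds) {
        double stall = stallSeconds(line);
        return !Double.isNaN(stall) && stall > thresholdSeconds;
    }

    public static void main(String[] args) {
        // Both lines are copied from the log in the question.
        String remark = " [Times: user=0.14 sys=0.22, real=15.73 secs] ";
        String young = " [Times: user=0.30 sys=0.01, real=0.10 secs] ";
        System.out.println("remark stalled: " + looksStalled(remark, 1.0));
        System.out.println("young stalled:  " + looksStalled(young, 1.0));
    }
}
```

A pause flagged this way was mostly spent waiting rather than collecting, which points at swapping, CPU contention, or safepoint delays rather than at the collector itself.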

Also, since it happens during a remark pause, using -XX:-ClassUnloadingWithConcurrentMark might fix it, but I guess that would only shift the problem to the regular GCs.

It may also be useful to trace how much it actually tries to unload, using -XX:+TraceClassUnloading. Application containers like Glassfish may do some weird things that cause lots of classes to pile up.
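
Besides the tracing flag, the class counters can also be read through the standard java.lang.management API. The sketch below is an assumption-laden illustration: the class name ClassCountProbe and the sampling loop are hypothetical, and the code only observes the JVM it runs in, so for the Glassfish process it would have to run inside the server or the same MXBean would have to be read over JMX.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

public class ClassCountProbe {
    /** One-line snapshot of the class loading counters of the current JVM. */
    public static String snapshot() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return String.format("loaded=%d totalLoaded=%d unloaded=%d",
                bean.getLoadedClassCount(),       // classes currently loaded
                bean.getTotalLoadedClassCount(),  // classes loaded since JVM start
                bean.getUnloadedClassCount());    // classes unloaded since JVM start
    }

    public static void main(String[] args) throws InterruptedException {
        // Sample a few times; the 1-second interval is arbitrary.
        for (int i = 0; i < 3; i++) {
            System.out.println(snapshot());
            Thread.sleep(1000);
        }
    }
}
```

If the loaded count keeps growing over days while the unloaded count jumps only at remark pauses, that would be consistent with classes piling up between collections.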

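Edit: for monitoring, the things you most need to watch are free physical memory (minus caches), CPU load, and pages in/out and page faults. Ideally, monitor paging at the process level, since it is irrelevant to the JVM whether some other process is waiting on the disk.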

As for CMS vs. G1: that is probably irrelevant to your problem.

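Have a look at this article. Can you tell which of the main areas the long GC pauses fall into in your case (the article has steps on how to determine that)? Or is it something else, such as other OS activity?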

Also, it may be due to a JVM bug. Check which JVM version you have.

Link: https://blogs.oracle.com/poonam/entry/troubleshooting_long_gc_pauses
