How to tune JVM 8 G1 arguments to avoid Full GC (Allocation Failure)

When using G1 GC on a server with -Xmx set to 8 GB, a Full GC (Allocation Failure) happens after the application has been running for some days.

I have tried adjusting the JVM GC arguments several times and printed all the GC details, but I still cannot identify the root cause.

JVM arguments:

java -Xms8g -Xmx8g 
-XX:+CrashOnOutOfMemoryError 
-XX:+AlwaysPreTouch 
-XX:-UseBiasedLocking 
-XX:MaxTenuringThreshold=15 
-Xss256k 
-XX:SurvivorRatio=6 
-XX:+UseTLAB 
-XX:GCTimeRatio=4 
-XX:+ScavengeBeforeFullGC 
-XX:G1HeapRegionSize=8M 
-XX:ConcGCThreads=8 
-XX:G1HeapWastePercent=10 
-XX:+AggressiveOpts 
-XX:MaxMetaspaceSize=256m 
-XX:+UseG1GC 
-XX:InitiatingHeapOccupancyPercent=35 
-XX:+DisableExplicitGC 
-Xloggc:/var/tmp/prod/query/Portfolio/PORTFOLIO-QRY-A-Instance1/query-gc.log
-verbose:gc 
-XX:+PrintGCDetails 
-XX:+PrintGCDateStamps 
-XX:+PrintGCTimeStamps 
-XX:+UseGCLogFileRotation 
-XX:NumberOfGCLogFiles=10 
-XX:GCLogFileSize=100M 
-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/var/tmp/prod/query/xxx.log 
-XX:NewSize=3g 
-XX:MaxNewSize=5g 
-server

The last GC details:

2019-01-25T00:25:28.998+0800: 399236.910: [GC pause (G1 Evacuation Pause) (young) (initial-mark) (to-space exhausted), 0.3400461 secs]
   [Eden: 1080.0M(3072.0M)->0.0B(3072.0M) Survivors: 0.0B->0.0B Heap: 8144.0M(8192.0M)->7936.0M(8192.0M)]
 [Times: user=2.22 sys=0.01, real=0.34 secs]
2019-01-25T00:25:29.338+0800: 399237.251: [GC concurrent-root-region-scan-start]
2019-01-25T00:25:29.338+0800: 399237.251: [GC concurrent-root-region-scan-end, 0.0000869 secs]
2019-01-25T00:25:29.338+0800: 399237.251: [GC concurrent-mark-start]
2019-01-25T00:25:29.419+0800: 399237.332: [GC pause (G1 Evacuation Pause) (young) (to-space exhausted), 0.2033834 secs]
   [Eden: 208.0M(3072.0M)->0.0B(3072.0M) Survivors: 0.0B->0.0B Heap: 8144.0M(8192.0M)->8144.0M(8192.0M)]
 [Times: user=0.67 sys=0.02, real=0.21 secs]
2019-01-25T00:25:29.624+0800: 399237.537: [GC pause (G1 Evacuation Pause) (young), 0.0076649 secs]
   [Eden: 0.0B(3072.0M)->0.0B(3072.0M) Survivors: 0.0B->0.0B Heap: 8144.0M(8192.0M)->8144.0M(8192.0M)]
 [Times: user=0.07 sys=0.00, real=0.01 secs]
2019-01-25T00:25:29.632+0800: 399237.545: [GC pause (G1 Evacuation Pause) (young), 0.0072213 secs]
   [Eden: 0.0B(3072.0M)->0.0B(3072.0M) Survivors: 0.0B->0.0B Heap: 8144.0M(8192.0M)->8144.0M(8192.0M)]
 [Times: user=0.06 sys=0.00, real=0.01 secs]
2019-01-25T00:25:29.640+0800: 399237.553: [GC pause (G1 Evacuation Pause) (young), 0.0032099 secs]
   [Eden: 0.0B(3072.0M)->0.0B(3072.0M) Survivors: 0.0B->0.0B Heap: 8144.0M(8192.0M)->8144.0M(8192.0M)]
 [Times: user=0.02 sys=0.01, real=0.00 secs]
2019-01-25T00:25:29.645+0800: 399237.557: [GC pause (G1 Evacuation Pause) (young), 0.0041076 secs]
   [Eden: 0.0B(3072.0M)->0.0B(3072.0M) Survivors: 0.0B->0.0B Heap: 8144.0M(8192.0M)->8144.0M(8192.0M)]
 [Times: user=0.03 sys=0.00, real=0.00 secs]
2019-01-25T00:25:29.649+0800: 399237.562: [GC pause (G1 Evacuation Pause) (young), 0.0027963 secs]
   [Eden: 0.0B(3072.0M)->0.0B(3072.0M) Survivors: 0.0B->0.0B Heap: 8144.0M(8192.0M)->8144.0M(8192.0M)]
 [Times: user=0.01 sys=0.00, real=0.01 secs]
2019-01-25T00:25:29.653+0800: 399237.566: [GC pause (G1 Evacuation Pause) (young), 0.0027614 secs]
   [Eden: 0.0B(3072.0M)->0.0B(3072.0M) Survivors: 0.0B->0.0B Heap: 8144.0M(8192.0M)->8144.0M(8192.0M)]
 [Times: user=0.02 sys=0.00, real=0.00 secs]
2019-01-25T00:25:29.656+0800: 399237.569: [Full GC (Allocation Failure) 8144M->4016M(8192M), 10.6192450 secs]
   [Eden: 0.0B(3072.0M)->0.0B(3072.0M) Survivors: 0.0B->0.0B Heap: 8144.0M(8192.0M)->4016.1M(8192.0M)], [Metaspace: 83995K->83979K(1126400K)]
 [Times: user=15.73 sys=0.00, real=10.62 secs]
2019-01-25T00:25:40.277+0800: 399248.190: [GC concurrent-mark-abort]
Heap
 garbage-first heap   total 8388608K, used 4268154K [0x00000005c0000000, 0x00000005c0802000, 0x00000007c0000000)
  region size 8192K, 20 young (163840K), 0 survivors (0K)
 Metaspace       used 84034K, capacity 85146K, committed 86732K, reserved 1126400K
  class space    used 8833K, capacity 9090K, committed 9420K, reserved 1048576K


Our server has 32 GB of RAM and 16 cores. Any advice or suggestions would be much appreciated!

It's complicated.

From the logs, the Full GC is happening because the old generation is too small.

"Allocation Failure" means an allocation request could not be satisfied, which here suggests that direct allocation into the old generation failed.

However, the logs also show that the minor GCs before the Full GC are too frequent and free very little space. There were 7 minor GCs between 25:29.338 and 25:29.653. Only the first of them (at 25:29.419) emptied any Eden space (208 MB), and even then the heap occupancy stayed at 8144 MB.
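To check how little each pause reclaims, the "Heap: before->after" figures can be pulled out of the log with a short script. This is a minimal sketch, not a general GC log parser: the function name `reclaimed_mb` is made up for illustration, and the pattern assumes the JDK 8 G1 format shown in the question with sizes printed in megabytes.

```python
import re

# Matches the heap transition in a JDK 8 G1 log entry, for example
# "Heap: 8144.0M(8192.0M)->7936.0M(8192.0M)". Assumes sizes are printed in M.
HEAP_RE = re.compile(r"Heap: ([\d.]+)M\([\d.]+M\)->([\d.]+)M\([\d.]+M\)")


def reclaimed_mb(log_text):
    """Return the MB reclaimed by each collection, in log order."""
    return [
        round(float(before) - float(after), 1)
        for before, after in HEAP_RE.findall(log_text)
    ]


# Three entries copied from the log in the question:
# the 00:25:28.998 pause, the 00:25:29.419 pause, and the Full GC.
sample = """
[Eden: 1080.0M(3072.0M)->0.0B(3072.0M) Survivors: 0.0B->0.0B Heap: 8144.0M(8192.0M)->7936.0M(8192.0M)]
[Eden: 208.0M(3072.0M)->0.0B(3072.0M) Survivors: 0.0B->0.0B Heap: 8144.0M(8192.0M)->8144.0M(8192.0M)]
[Eden: 0.0B(3072.0M)->0.0B(3072.0M) Survivors: 0.0B->0.0B Heap: 8144.0M(8192.0M)->4016.1M(8192.0M)]
"""

print(reclaimed_mb(sample))  # [208.0, 0.0, 4127.9]
```

Running the same function over the whole log file makes it easier to see when the young collections stop reclaiming anything before the Full GC.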

The worst problem is "[Eden: 0.0B(3072.0M)->0.0B(3072.0M) Survivors: 0.0B->0.0B". It seems that all objects are being allocated in the old generation. From the log entry "[Eden: 0.0B(3072.0M)->0.0B(3072.0M) Survivors: 0.0B->0.0B Heap: 8144.0M(8192.0M)->4016.1M(8192.0M)]", this Full GC freed more than 4000 MB of old generation space. That is very odd. Your application needs at least 4 GB of old generation, yet roughly half of what was sitting in the old generation turned out to be garbage. This means the old objects are not really old: most of them were promoted prematurely, or many of them are humongous objects.
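The figures behind this reasoning can be worked out directly. The sketch below uses the numbers from the Full GC entry and from -XX:G1HeapRegionSize=8M. The helper names are hypothetical, and the humongous threshold is an approximation: G1 treats objects of about half a region or larger as humongous.

```python
def humongous_threshold_mb(region_size_mb):
    """Approximate object size at which G1 allocates it as humongous."""
    return region_size_mb / 2


def full_gc_summary(before_mb, after_mb):
    """Return (MB freed, fraction of the pre-GC occupancy that was garbage)."""
    freed = before_mb - after_mb
    return round(freed, 1), round(freed / before_mb, 2)


# Region size from -XX:G1HeapRegionSize=8M
print(humongous_threshold_mb(8))        # 4.0

# Heap: 8144.0M(8192.0M)->4016.1M(8192.0M) from the Full GC entry
print(full_gc_summary(8144.0, 4016.1))  # (4127.9, 0.51)
```

With 8 MB regions, any single allocation of around 4 MB or more bypasses Eden and goes straight into old-generation regions, which is one way a heap can fill up while Eden stays at 0.0B.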

Here are some suggestions:

  1. -XX:InitiatingHeapOccupancyPercent=35 looks too low. This threshold controls when G1 starts a concurrent marking cycle, not how often minor GCs run. With about 4 GB still live after the Full GC, an 8 GB heap is above 35% (about 2.8 GB) all the time, so a new marking cycle is likely to be requested as soon as the previous one finishes. You can try a larger -XX:InitiatingHeapOccupancyPercent. Adaptive IHOP is another option, but it is only available from JDK 9 onward.
  2. The old generation is too small. Some applications need much more old generation space than young generation space. A common rule of thumb is to make the old generation two to three times the size of the young generation.
  3. -XX:MaxTenuringThreshold=15 is already the maximum value HotSpot accepts (the valid range is 0 to 15), so it cannot be raised to 20 or more.
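The sizing arithmetic behind these suggestions can be sketched as follows. It rests on several assumptions: that -XX:NewSize=3g acts as a floor for the young generation, that the IHOP percentage is measured against the whole heap, and that the 4016.1 MB left after the Full GC is a fair estimate of the live set. The function names are illustrative only.

```python
import math


def ihop_threshold_mb(heap_mb, ihop_percent):
    """Heap occupancy (MB) at which a concurrent marking cycle is requested."""
    return heap_mb * ihop_percent / 100


def old_gen_ceiling_mb(heap_mb, min_young_mb):
    """Space left for the old generation once the young floor is reserved."""
    return heap_mb - min_young_mb


HEAP_MB = 8192                  # -Xms8g -Xmx8g
MIN_YOUNG_MB = 3072             # -XX:NewSize=3g (assumed to be a hard floor)
LIVE_AFTER_FULL_GC_MB = 4016.1  # heap occupancy right after the Full GC

threshold = ihop_threshold_mb(HEAP_MB, 35)
old_ceiling = old_gen_ceiling_mb(HEAP_MB, MIN_YOUNG_MB)

print(threshold)                                        # 2867.2
print(LIVE_AFTER_FULL_GC_MB > threshold)                # True
print(old_ceiling)                                      # 5120
print(round(old_ceiling / MIN_YOUNG_MB, 2))             # 1.67
print(math.ceil(LIVE_AFTER_FULL_GC_MB / HEAP_MB * 100)) # 50
```

Under these assumptions the old generation can be at most about 1.67 times the young generation, which is below the two-to-three-times rule of thumb, and the live set alone is already about 50% of the heap, well above the 35% threshold.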
