How do I interpret the G1 GC logs leading up to an OutOfMemoryError?

Question

Can someone explain how to interpret the G1 GC logs leading up to an OutOfMemoryError?

I know that a heap dump is the best way to find out what is actually using the heap, but I can't get one because it contains protected information that cannot leave the client site. All I have are the application logs (which include the stack trace from the OOME) and the G1 GC logs.

The full G1 GC logs have a lot of detail, so I won't put them here unless someone specifically needs to see them.

The logs came from this Java version:

> java -version
java version "1.7.0_21"
Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)

The GC options I am using to create the GC log are:

-XX:+PrintGCTimeStamps
-XX:+PrintGCDetails
-Xloggc:log/gc.log

Below are the memory stats from every young and full GC during the last 30 minutes leading up to the OOME:

INFO   | jvm 1    | 2015/05/28 04:29:34 | [Eden: 1290M(1290M)->0B(1290M) Survivors: 20M->20M Heap: 2445M(3932M)->1155M(3932M)]
INFO   | jvm 1    | 2015/05/28 04:33:21 | [Eden: 1290M(1290M)->0B(1290M) Survivors: 20M->20M Heap: 2445M(3932M)->1155M(3932M)]
INFO   | jvm 1    | 2015/05/28 04:37:09 | [Eden: 1290M(1290M)->0B(1290M) Survivors: 20M->20M Heap: 2445M(3932M)->1155M(3932M)]
INFO   | jvm 1    | 2015/05/28 04:40:58 | [Eden: 1290M(1290M)->0B(1290M) Survivors: 20M->20M Heap: 2445M(3932M)->1155M(3932M)]
INFO   | jvm 1    | 2015/05/28 04:44:44 | [Eden: 1290M(1290M)->0B(1290M) Survivors: 20M->20M Heap: 2445M(3932M)->1155M(3932M)]
INFO   | jvm 1    | 2015/05/28 04:48:30 | [Eden: 1290M(1290M)->0B(1290M) Survivors: 20M->20M Heap: 2445M(3932M)->1155M(3932M)]
INFO   | jvm 1    | 2015/05/28 04:52:17 | [Eden: 1290M(1290M)->0B(1290M) Survivors: 20M->20M Heap: 2445M(3932M)->1155M(3932M)]
INFO   | jvm 1    | 2015/05/28 04:52:58 | [Eden: 639M(1290M)->0B(1295M) Survivors: 20M->15M Heap: 2278M(3932M)->1635M(3932M)]
INFO   | jvm 1    | 2015/05/28 04:52:59 | [Eden: 51M(1295M)->0B(1300M) Survivors: 15M->10M Heap: 2561M(3932M)->2505M(3932M)]
INFO   | jvm 1    | 2015/05/28 04:52:59 | [Full GC 2505M->1170M(3901M), 1.9469560 secs]
INFO   | jvm 1    | 2015/05/28 04:53:01 | [Eden: 44M(1300M)->0B(1299M) Survivors: 0B->1024K Heap: 1653M(3901M)->1610M(3901M)]
INFO   | jvm 1    | 2015/05/28 04:53:01 | [Eden: 1024K(1299M)->0B(1299M) Survivors: 1024K->1024K Heap: 1610M(3901M)->1610M(3901M)]
INFO   | jvm 1    | 2015/05/28 04:53:02 | [Full GC 1610M->1158M(3891M), 1.4317370 secs]
INFO   | jvm 1    | 2015/05/28 04:53:03 | [Eden: 112M(1299M)->0B(1296M) Survivors: 0B->1024K Heap: 1758M(3891M)->1647M(3891M)]
INFO   | jvm 1    | 2015/05/28 04:53:06 | [Eden: 49M(1296M)->0B(1360M) Survivors: 1024K->1024K Heap: 2776M(4084M)->2728M(4084M)]
INFO   | jvm 1    | 2015/05/28 04:53:06 | [Eden: 0B(1360M)->0B(1360M) Survivors: 1024K->1024K Heap: 2837M(4084M)->2836M(4084M)]
INFO   | jvm 1    | 2015/05/28 04:53:06 | [Full GC 2836M->1158M(3891M), 1.4847750 secs]
INFO   | jvm 1    | 2015/05/28 04:53:08 | [Full GC 1158M->1158M(3891M), 1.5313770 secs]

* This is a different format from the raw logs; I removed the timing details to make it shorter and easier to read.
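The condensed lines above do not print a tenured figure directly. A minimal sketch of how it could be estimated, assuming the Heap figure is the sum of eden, survivor and old regions (humongous regions included), so that old is roughly heap minus eden minus survivors. The class and method names are hypothetical, and the pattern targets only the condensed format shown here, not the raw G1 log format.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class G1OldGenEstimate {
    // Matches the condensed young-GC summary lines shown above (assumed format).
    private static final Pattern LINE = Pattern.compile(
        "Eden: (\\d+)([BKMG])\\(\\d+[BKMG]\\)->(\\d+)([BKMG])\\(\\d+[BKMG]\\) "
      + "Survivors: (\\d+)([BKMG])->(\\d+)([BKMG]) "
      + "Heap: (\\d+)([BKMG])\\(\\d+[BKMG]\\)->(\\d+)([BKMG])\\(\\d+[BKMG]\\)");

    // Converts a value with a B/K/M/G suffix to kilobytes (bytes are rounded down).
    static long toKb(String number, String unit) {
        long n = Long.parseLong(number);
        switch (unit.charAt(0)) {
            case 'B': return n / 1024;
            case 'K': return n;
            case 'M': return n * 1024;
            default:  return n * 1024 * 1024; // 'G'
        }
    }

    // Returns {estimated old-gen KB before the GC, estimated old-gen KB after the GC}.
    static long[] estimateOldKb(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a young-GC summary line: " + line);
        }
        long edenBefore = toKb(m.group(1), m.group(2));
        long edenAfter  = toKb(m.group(3), m.group(4));
        long survBefore = toKb(m.group(5), m.group(6));
        long survAfter  = toKb(m.group(7), m.group(8));
        long heapBefore = toKb(m.group(9), m.group(10));
        long heapAfter  = toKb(m.group(11), m.group(12));
        return new long[] {
            heapBefore - edenBefore - survBefore,
            heapAfter - edenAfter - survAfter
        };
    }

    public static void main(String[] args) {
        String line = "[Eden: 51M(1295M)->0B(1300M) Survivors: 15M->10M "
                    + "Heap: 2561M(3932M)->2505M(3932M)]";
        long[] old = estimateOldKb(line);
        System.out.println("old before=" + old[0] / 1024 + "M, after=" + old[1] / 1024 + "M");
    }
}
```

Applied to the 04:52:59 young GC, this gives about 2495M of old space both before and after the collection, compared with about 1135M for the earlier steady-state lines.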

I also graphed the raw GC logs in GCViewer:

[Screenshot: GCViewer graph of the raw GC logs]

Everything seemed to be going fine up to this point:

The used tenured space was constant (the dark magenta line at the bottom of the graph).
Young GC pauses occurred every few minutes, cleaning up all the young objects (the grey line at the top of the graph).
Allocated sizes for each generation were constant.
Heap usage after the young GCs was around 1155M.

Then at 2015/05/28 04:52:59 things went pear-shaped:

The used tenured space suddenly started growing.
When the young GC ran there was only 51M in the eden space.
Full GCs started occurring.
The first 3 full GCs seemed OK: they reduced the heap usage to 1158M-1170M (which is very close to the normal 1155M).
The final full GC started with 1158M used and still had 1158M used afterwards.

The Memory tab in the screenshot shows:

Tenured heap (usage / alloc. max)    2,836 (104.1%) / 2,723M
Total promotion                                       2,048K

Now to explain briefly what occurred at 2015/05/28 04:52:59. At this point a whole bunch of configuration objects were serialised to a custom format using a StringBuilder. This resulted in a series of array copies, which eventually led to the following exception at 2015/05/28 04:53:09:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2367)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:587)
    at java.lang.StringBuilder.append(StringBuilder.java:214)
    ...

There are a few things I can't explain:

Where in the GC logs would you find the used tenured memory?
What would cause such a significant spike in the used tenured memory that it triggers a full GC? There were only 20M of survivors, so in the worst case wouldn't just those be promoted to tenured and nothing more?
Could this perhaps be explained by a humongous object allocation?
Why was the last full GC triggered when there was (apparently) so little heap in use, and why did it clean up nothing?
If there was 3891M of heap allocated and only 1158M used, then why would there be an OOME?

Answer 1

Your OutOfMemoryError happens during StringBuilder.append. Bear in mind that every time you append and the buffer inside the StringBuilder is too small, it expands its capacity: it tries to allocate a new buffer of twice the current capacity plus 2, or of the required new length if that is bigger. (See the source code for AbstractStringBuilder.java.)

For example, if your string builder already holds 100 characters and is full, and you then append another 10 characters to it, it will expand to:

100 * 2 + 2 = 202, which is more than the 110 characters required, so the new capacity is 202.
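A minimal sketch of that growth rule, written out as a standalone function. The class and method names are hypothetical; the arithmetic mirrors the rule described above and ignores the integer-overflow handling that the real JDK code also performs.

```java
public class BuilderGrowth {
    // Growth rule as described above: double the current capacity and add 2,
    // unless the minimum required capacity is even larger.
    // Simplification: no overflow handling for very large capacities.
    static int newCapacity(int currentCapacity, int minimumCapacity) {
        int grown = currentCapacity * 2 + 2;
        return (grown < minimumCapacity) ? minimumCapacity : grown;
    }

    public static void main(String[] args) {
        // Full 100-character buffer, appending 10 characters: 110 are required.
        System.out.println(newCapacity(100, 110)); // prints 202
        // Appending 400 characters at once: the required length wins.
        System.out.println(newCapacity(100, 500)); // prints 500
    }
}
```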

So if you already have a really long string (10MB), it will attempt to create a 20MB buffer, and so on. The old buffer is still referenced while its contents are copied, so both arrays have to fit in the heap at the same time.
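To put rough numbers on a single expansion, here is a small sketch. It assumes the builder is backed by a char[] with 2 bytes per char (as in Java 7) and ignores array headers; the class and method names are hypothetical. The old array stays live during the copy, so the transient footprint is the old array plus the new one.

```java
public class ExpansionFootprint {
    private static final long BYTES_PER_CHAR = 2; // char[] backing array, as in Java 7

    // Approximate bytes that must be live at once while a full builder
    // holding `chars` characters grows to (chars * 2 + 2) characters.
    static long transientBytes(long chars) {
        long oldArray = chars * BYTES_PER_CHAR;
        long newArray = (chars * 2 + 2) * BYTES_PER_CHAR;
        return oldArray + newArray;
    }

    public static void main(String[] args) {
        long chars = 5L * 1024 * 1024; // about 10MB of character data
        System.out.println(transientBytes(chars) / (1024 * 1024) + " MB (approx.)");
    }
}
```

Each further doubling repeats this with twice the size, and every new array has to be found as a single contiguous allocation.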

Check your code and make sure you are not creating huge strings in the builder.
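One possible way to avoid a single huge builder is to write each entry to a Writer as it is produced, so no buffer has to hold the whole output. This is only a sketch: the key=value format and the names below are hypothetical stand-ins for the custom serialisation format, which is not shown in the question.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.LinkedHashMap;
import java.util.Map;

public class StreamingSerializer {
    // Writes one "key=value" line per entry straight to the Writer
    // instead of accumulating everything in one StringBuilder.
    static void write(Map<String, String> config, Writer out) throws IOException {
        for (Map.Entry<String, String> entry : config.entrySet()) {
            out.write(entry.getKey());
            out.write('=');
            out.write(entry.getValue());
            out.write('\n');
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> config = new LinkedHashMap<String, String>();
        config.put("host", "localhost");
        config.put("port", "8080");
        // StringWriter is used only to keep the demo self-contained; in practice
        // the target would be a BufferedWriter over a file or socket stream.
        StringWriter out = new StringWriter();
        write(config, out);
        System.out.print(out);
    }
}
```

If the result really must end up in one in-memory string, passing an estimated size to the StringBuilder constructor at least avoids the repeated doubling copies.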
