Java parallel GC: unnecessary Full GC

I have a service that reads data from a source, performs some conversion on the data, and then uploads the converted data to a destination. When picking a GC algorithm, I was looking for the one with the highest throughput, which is why I picked parallel GC. What confuses me is why I see a good number of Full GCs. The nature of the service makes most of the objects short-lived, as data comes and goes. Here is my GC config:

-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -verbose:GC -XX:+UseParallelGC -XX:NewSize=21200m -XX:MaxNewSize=21200m -server -Xms31200m -Xmx31200m

Basically, I set the total heap size to 30GB and the young gen size to 20GB.

Here is a piece of the GC log:

2020-11-08T08:31:07.863+0000: 215.876: [Full GC (Ergonomics) [PSYoungGen: 1233347K->0K(18729472K)] [ParOldGen: 9065862K->6633660K(10240000K)] 10299209K->6633660K(28969472K), [Metaspace: 107588K->107588K(1144832K)], 1.1350824 secs] [Times: user=21.03 sys=0.00, real=1.14 secs]
2020-11-08T08:31:10.627+0000: 218.640: [GC (GCLocker Initiated GC)
Desired survivor size 2699034624 bytes, new threshold 1 (max 15)
[PSYoungGen: 15874560K->1274938K(19073024K)] 22513996K->7914375K(29313024K), 0.1073842 secs] [Times: user=3.10 sys=0.00, real=0.11 secs]
2020-11-08T08:31:12.319+0000: 220.331: [GC (GCLocker Initiated GC)
Desired survivor size 2587885568 bytes, new threshold 1 (max 15)
[PSYoungGen: 17602106K->1307000K(18962944K)] 24253865K->8618788K(29202944K), 0.2492961 secs] [Times: user=7.16 sys=0.00, real=0.25 secs]
2020-11-08T08:31:14.197+0000: 222.210: [GC (GCLocker Initiated GC)
Desired survivor size 2480930816 bytes, new threshold 1 (max 15)
[PSYoungGen: 17634168K->1333816K(19286016K)] 24952891K->9297010K(29526016K), 0.2524904 secs] [Times: user=7.07 sys=0.00, real=0.25 secs]
2020-11-08T08:31:16.165+0000: 224.178: [GC (GCLocker Initiated GC)
Desired survivor size 2386558976 bytes, new threshold 1 (max 15)
[PSYoungGen: 18092600K->1313137K(19181568K)] 26062932K->9992006K(29421568K), 0.2845171 secs] [Times: user=7.85 sys=0.00, real=0.29 secs]
2020-11-08T08:31:18.084+0000: 226.096: [GC (GCLocker Initiated GC)
Desired survivor size 2312110080 bytes, new threshold 1 (max 15)
[PSYoungGen: 18071921K->1242981K(19450880K)] 26751020K->10584632K(29690880K), 0.2523254 secs] [Times: user=6.79 sys=0.00, real=0.26 secs]
2020-11-08T08:31:18.336+0000: 226.349: [Full GC (Ergonomics) [PSYoungGen: 1242981K->0K(19450880K)] [ParOldGen: 9341651K->6896991K(10240000K)] 10584632K->6896991K(29690880K), [Metaspace: 107625K->107625K(1144832K)], 1.0198299 secs] [Times: user=18.34 sys=0.08, real=1.02 secs]
2020-11-08T08:31:21.049+0000: 229.062: [GC (GCLocker Initiated GC)
Desired survivor size 2221408256 bytes, new threshold 1 (max 15)
[PSYoungGen: 17120256K->1356565K(19378176K)] 24043241K->8279559K(29618176K), 0.1089915 secs] [Times: user=3.38 sys=0.00, real=0.11 secs]
2020-11-08T08:31:22.887+0000: 230.899: [GC (GCLocker Initiated GC)
Desired survivor size 2155872256 bytes, new threshold 1 (max 15)
[PSYoungGen: 18476821K->1265473K(19603456K)] 25426058K->8896652K(29843456K), 0.2524566 secs] [Times: user=7.14 sys=0.00, real=0.25 secs]
2020-11-08T08:31:24.888+0000: 232.901: [GC (GCLocker Initiated GC)
Desired survivor size 2092433408 bytes, new threshold 1 (max 15)
[PSYoungGen: 18699585K->1388375K(19539456K)] 26345045K->9562491K(29779456K), 0.2113546 secs] [Times: user=5.59 sys=0.00, real=0.21 secs]
2020-11-08T08:31:26.819+0000: 234.832: [GC (GCLocker Initiated GC)
Desired survivor size 2030043136 bytes, new threshold 1 (max 15)
[PSYoungGen: 18822487K->1308016K(19726336K)] 27003840K->10002863K(29966336K), 0.2078162 secs] [Times: user=6.10 sys=0.00, real=0.21 secs]
2020-11-08T08:31:28.868+0000: 236.881: [GC (GCLocker Initiated GC)
Desired survivor size 2030043136 bytes, new threshold 1 (max 15)
[PSYoungGen: 18990960K->1521040K(19665408K)] 27712283K->10780549K(29905408K), 0.2373748 secs] [Times: user=6.60 sys=0.00, real=0.23 secs]
2020-11-08T08:31:29.106+0000: 237.119: [Full GC (Ergonomics) [PSYoungGen: 1521040K->0K(19665408K)] [ParOldGen: 9259509K->7378423K(10240000K)] 10780549K->7378423K(29905408K), [Metaspace: 107653K->107653K(1144832K)], 1.0809680 secs] [Times: user=20.55 sys=0.00, real=1.09 secs]

There are a couple of things in the log that really confused me:

  1. How does the JVM figure out the Desired survivor size? Why was it around 2.5GB? Why did it change a little in each minor GC? Why did the total old gen size never change (10240000K) while the total young gen size changed all the time?
  2. Why was the new threshold always 1? Isn't that far too aggressive about moving things into old gen?
  3. After each minor GC, the young gen typically held around 1.3GB of data, and some amount of data was moved to old gen. That caused old gen to gradually fill up, and eventually a Full GC happened to clean up old gen. Why was a portion of the data moved to old gen in each minor GC? The survivor space looks big enough.
  4. What can I do to avoid unnecessary Full GCs so that the overall throughput improves?

I'll just address point 4, as the answer from Sachith deals with the first 3. You have selected a GC that will not perform an old gen (or full) GC until it's really needed. Full GC is the most expensive kind of collection, so until one is needed the CPU cycles are dedicated to your work instead. Using a concurrent GC will remove some or all of the full GCs, but you will lose CPU cycles to the concurrent work instead. So there is no guarantee that a concurrent GC will actually be faster. Also, from your flags -Xms31200m -Xmx31200m: you are setting the min and max sizes of the heap to the same value, and that means the VM will not perform any ergonomics (adaptations) on the total heap size. If the performance of your app is important and you have a decent test environment, I would suggest testing different GCs and seeing what kind of performance you get. I would also use the factory settings for everything except max heap and see how far that gets you.
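One way to compare collectors in a test environment is to read the JVM's own GC counters at the end of a run. The sketch below uses the standard java.lang.management beans; the class name GcOverhead and the throughput helper are made up for illustration. The reported collection time is only an approximation of time lost to GC, and for concurrent collectors it may include work that ran alongside the application, so treat the percentage as a rough indicator.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverhead {
    /** Fraction of elapsed time not attributed to GC, given totals in milliseconds. */
    public static double throughput(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 1.0; // nothing measured yet
        }
        return 1.0 - (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        long totalGcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            System.out.printf("%s: %d collections, %d ms%n", gc.getName(), count, millis);
            if (millis > 0) {
                totalGcMillis += millis;
            }
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("approx. throughput: %.2f%%%n",
                100.0 * throughput(totalGcMillis, uptime));
    }
}
```

Calling the same logic at the end of the real workload (instead of this empty main) under each candidate collector gives numbers that can be compared side by side.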

Well, your questions cannot all be answered with simple explanations:

  1. The JVM uses the -XX:SurvivorRatio parameter to define the survivor space size. The default value is -XX:SurvivorRatio=8. This is a ratio, and it means each survivor space is one-eighth the size of the Eden space. For your case this gives a survivor space size of roughly 1/8 * 20GB, which is about the 2.5GB you see. According to this document, this is often not important for performance. Since you're setting a fixed, large size for the young generation, the old generation stays the same. Using -XX:+UseAdaptiveSizePolicy for ParallelGC may help to adapt the sizes around the young/old boundary. Also, the larger the young generation, the less often minor collections happen. These minor collections seem to be where you see the survivor space shrink and grow slightly.

  2. The threshold is chosen by the JVM for ParallelGC. As per this article:

If survivor spaces are too small, copying collection overflows directly into the tenured generation. If survivor spaces are too large, they will be uselessly empty. At each garbage collection, the virtual machine chooses a threshold number, which is the number of times an object can be copied before it is tenured. This threshold is chosen to keep the survivors half full.

This does seem like aggressive behavior. But the minor collection cycles are significantly far apart, and it seems the threshold is allowed to rise up to 15 if required.
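The "keep the survivors half full" rule from the quote can be sketched as follows. This is a simplified illustration of the idea, not the JVM's code: the class and method names are hypothetical, sizes are in bytes, and ParallelGC's adaptive size policy takes further inputs into account, so the thresholds in the log above need not follow this rule exactly.

```java
public class TenuringThresholdSketch {
    /**
     * Picks the lowest age at which the cumulative size of surviving objects
     * exceeds the desired survivor occupancy.
     *
     * @param bytesByAge           bytesByAge[a] = bytes of survivors with age a (index 0 unused)
     * @param survivorCapacity     capacity of one survivor space in bytes
     * @param targetSurvivorRatio  desired occupancy in percent (50 means half full)
     * @param maxTenuringThreshold upper bound for the result (15 in the log above)
     */
    public static int chooseThreshold(long[] bytesByAge, long survivorCapacity,
                                      int targetSurvivorRatio, int maxTenuringThreshold) {
        long desired = survivorCapacity * targetSurvivorRatio / 100;
        long total = 0;
        for (int age = 1; age < bytesByAge.length; age++) {
            total += bytesByAge[age];
            if (total > desired) {
                // Objects at this age or older no longer fit the target: tenure them.
                return Math.min(age, maxTenuringThreshold);
            }
        }
        // Everything fits within the target, so objects may age as long as allowed.
        return maxTenuringThreshold;
    }

    public static void main(String[] args) {
        long[] bytesByAge = {0, 600, 300, 200};
        System.out.println(chooseThreshold(bytesByAge, 2000, 50, 15)); // prints 3
    }
}
```

With a 2000-byte survivor space and a 50% target, ages 1 and 2 together hold 900 bytes, and adding age 3 crosses the 1000-byte target, so objects are tenured once they reach age 3.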

  3. If some objects survive the required number of garbage collection cycles in the young generation, then by ParallelGC's design they are destined to move to the old generation. However large the young generation is, you cannot guarantee that long-lived objects will stay in it forever. The young generation is for rapidly allocating and deallocating objects, not for long-lived objects. So, as you observed, the old generation eventually fills up and gets cleaned.

  4. Assuming you're using Java 8 or later, to improve the throughput of the program I would say use G1GC instead of ParallelGC. Since your heap is quite large, G1GC would be a good choice. The G1GC algorithm is designed to run on very large heaps, up to terabytes (TB), with minimal pause times. G1GC is recommended for heaps larger than 6GB (Garbage First Garbage Collector Tuning). When using G1GC, -XX:+UseStringDeduplication would be a great help if your program holds many duplicate String objects. This GC partitions the entire heap into multiple small regions and executes the collection process using parallel and concurrent threads.

There are also two other experimental GCs (ZGC and Shenandoah), which were released with Java 11 and Java 12 respectively. These GCs significantly reduce pause times by doing more of the garbage collection work concurrently.

Update: stable releases of ZGC and Shenandoah are available with Java 15, released in September 2020.
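The promotion described in point 3 can be checked directly against the log in the question. In a minor GC line, old gen occupancy is the total heap figure minus the PSYoungGen figure, so the growth of that difference across the collection is the amount promoted. The sketch below (class and method names are made up for illustration) assumes the -XX:+PrintGCDetails line format shown in the question and ignores Full GC lines.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLogPromotion {
    // Matches e.g. "[PSYoungGen: 17602106K->1307000K(18962944K)] 24253865K->8618788K(29202944K)".
    // Full GC lines have "[ParOldGen: ...]" after the young gen part and do not match.
    private static final Pattern MINOR_GC = Pattern.compile(
            "\\[PSYoungGen: (\\d+)K->(\\d+)K\\(\\d+K\\)\\] (\\d+)K->(\\d+)K\\(\\d+K\\)");

    /** Returns the old gen growth in KB for a minor GC line, or empty for any other line. */
    public static OptionalLong promotedKb(String line) {
        Matcher m = MINOR_GC.matcher(line);
        if (!m.find()) {
            return OptionalLong.empty();
        }
        long youngBefore = Long.parseLong(m.group(1));
        long youngAfter = Long.parseLong(m.group(2));
        long heapBefore = Long.parseLong(m.group(3));
        long heapAfter = Long.parseLong(m.group(4));
        long oldBefore = heapBefore - youngBefore; // old gen occupancy before the GC
        long oldAfter = heapAfter - youngAfter;    // old gen occupancy after the GC
        return OptionalLong.of(oldAfter - oldBefore);
    }

    public static void main(String[] args) {
        String line = "[PSYoungGen: 17602106K->1307000K(18962944K)] "
                + "24253865K->8618788K(29202944K), 0.2492961 secs]";
        // Old gen goes from 6651759K to 7311788K during this collection.
        System.out.println(promotedKb(line).getAsLong() + " KB promoted"); // prints 660029 KB promoted
    }
}
```

For the second minor GC in the log this gives about 645MB promoted in a single collection, which is consistent with an old gen of 10240000K filling up again a few collections after each Full GC.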
