Java parallel GC: unnecessary Full GC

I have a service that reads data from a source, converts it, and then uploads the converted data to a destination. When picking a GC algorithm I wanted the one with the highest throughput, which is why I picked the parallel GC. What confuses me is why I see so many Full GCs. By the nature of the service, most objects are short-lived, since data comes in and goes out. Here is my GC config:

-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -verbose:GC -XX:+UseParallelGC -XX:NewSize=21200m -XX:MaxNewSize=21200m -server -Xms31200m -Xmx31200m

Basically, I set the total heap size to about 30 GB and the young gen size to about 20 GB.

Here is a piece of the GC log:

2020-11-08T08:31:07.863+0000: 215.876: [Full GC (Ergonomics) [PSYoungGen: 1233347K->0K(18729472K)] [ParOldGen: 9065862K->6633660K(10240000K)] 10299209K->6633660K(28969472K), [Metaspace: 107588K->107588K(1144832K)], 1.1350824 secs] [Times: user=21.03 sys=0.00, real=1.14 secs]
2020-11-08T08:31:10.627+0000: 218.640: [GC (GCLocker Initiated GC)
Desired survivor size 2699034624 bytes, new threshold 1 (max 15)
[PSYoungGen: 15874560K->1274938K(19073024K)] 22513996K->7914375K(29313024K), 0.1073842 secs] [Times: user=3.10 sys=0.00, real=0.11 secs]
2020-11-08T08:31:12.319+0000: 220.331: [GC (GCLocker Initiated GC)
Desired survivor size 2587885568 bytes, new threshold 1 (max 15)
[PSYoungGen: 17602106K->1307000K(18962944K)] 24253865K->8618788K(29202944K), 0.2492961 secs] [Times: user=7.16 sys=0.00, real=0.25 secs]
2020-11-08T08:31:14.197+0000: 222.210: [GC (GCLocker Initiated GC)
Desired survivor size 2480930816 bytes, new threshold 1 (max 15)
[PSYoungGen: 17634168K->1333816K(19286016K)] 24952891K->9297010K(29526016K), 0.2524904 secs] [Times: user=7.07 sys=0.00, real=0.25 secs]
2020-11-08T08:31:16.165+0000: 224.178: [GC (GCLocker Initiated GC)
Desired survivor size 2386558976 bytes, new threshold 1 (max 15)
[PSYoungGen: 18092600K->1313137K(19181568K)] 26062932K->9992006K(29421568K), 0.2845171 secs] [Times: user=7.85 sys=0.00, real=0.29 secs]
2020-11-08T08:31:18.084+0000: 226.096: [GC (GCLocker Initiated GC)
Desired survivor size 2312110080 bytes, new threshold 1 (max 15)
[PSYoungGen: 18071921K->1242981K(19450880K)] 26751020K->10584632K(29690880K), 0.2523254 secs] [Times: user=6.79 sys=0.00, real=0.26 secs]
2020-11-08T08:31:18.336+0000: 226.349: [Full GC (Ergonomics) [PSYoungGen: 1242981K->0K(19450880K)] [ParOldGen: 9341651K->6896991K(10240000K)] 10584632K->6896991K(29690880K), [Metaspace: 107625K->107625K(1144832K)], 1.0198299 secs] [Times: user=18.34 sys=0.08, real=1.02 secs]
2020-11-08T08:31:21.049+0000: 229.062: [GC (GCLocker Initiated GC)
Desired survivor size 2221408256 bytes, new threshold 1 (max 15)
[PSYoungGen: 17120256K->1356565K(19378176K)] 24043241K->8279559K(29618176K), 0.1089915 secs] [Times: user=3.38 sys=0.00, real=0.11 secs]
2020-11-08T08:31:22.887+0000: 230.899: [GC (GCLocker Initiated GC)
Desired survivor size 2155872256 bytes, new threshold 1 (max 15)
[PSYoungGen: 18476821K->1265473K(19603456K)] 25426058K->8896652K(29843456K), 0.2524566 secs] [Times: user=7.14 sys=0.00, real=0.25 secs]
2020-11-08T08:31:24.888+0000: 232.901: [GC (GCLocker Initiated GC)
Desired survivor size 2092433408 bytes, new threshold 1 (max 15)
[PSYoungGen: 18699585K->1388375K(19539456K)] 26345045K->9562491K(29779456K), 0.2113546 secs] [Times: user=5.59 sys=0.00, real=0.21 secs]
2020-11-08T08:31:26.819+0000: 234.832: [GC (GCLocker Initiated GC)
Desired survivor size 2030043136 bytes, new threshold 1 (max 15)
[PSYoungGen: 18822487K->1308016K(19726336K)] 27003840K->10002863K(29966336K), 0.2078162 secs] [Times: user=6.10 sys=0.00, real=0.21 secs]
2020-11-08T08:31:28.868+0000: 236.881: [GC (GCLocker Initiated GC)
Desired survivor size 2030043136 bytes, new threshold 1 (max 15)
[PSYoungGen: 18990960K->1521040K(19665408K)] 27712283K->10780549K(29905408K), 0.2373748 secs] [Times: user=6.60 sys=0.00, real=0.23 secs]
2020-11-08T08:31:29.106+0000: 237.119: [Full GC (Ergonomics) [PSYoungGen: 1521040K->0K(19665408K)] [ParOldGen: 9259509K->7378423K(10240000K)] 10780549K->7378423K(29905408K), [Metaspace: 107653K->107653K(1144832K)], 1.0809680 secs] [Times: user=20.55 sys=0.00, real=1.09 secs]

There are a couple of things from the log that really confused me:

  1. How does the JVM figure out the desired survivor size? Why was it around 2.5 GB? Why did it change a little in each minor GC? Why did the total old gen size never change (10240000K) while the total young gen size changed all the time?
  2. Why was the new threshold always 1? Isn't that far too aggressive about moving things into the old gen?
  3. After each minor GC, the young gen typically held around 1.3 GB of data, and some of that data was moved to the old gen. That caused the old gen to fill up gradually until a Full GC eventually ran to clean it up. Why was a portion of the data moved to the old gen in each minor GC? The survivor space looks big enough.
  4. What can I do to avoid unnecessary Full GCs so that overall throughput improves?
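The promotion described in point 3 can be estimated directly from a minor GC line: the old gen occupancy is the whole-heap occupancy minus the young gen occupancy, so its growth across one collection approximates the amount promoted. The sketch below is only an illustration; the class and method names are hypothetical, and it assumes the Java 8 ParallelGC log format shown above.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: estimates old gen growth from one minor GC log line. */
final class PromotionEstimator {

    // Matches "[PSYoungGen: <before>K-><after>K(<cap>K)] <heapBefore>K-><heapAfter>K(<cap>K)".
    // Full GC lines do not match, because "[ParOldGen: ...]" follows the young gen part there.
    private static final Pattern MINOR_GC = Pattern.compile(
            "\\[PSYoungGen: (\\d+)K->(\\d+)K\\(\\d+K\\)\\] (\\d+)K->(\\d+)K\\(\\d+K\\)");

    /** Returns the old gen growth in KB, or empty if the line is not a minor GC line. */
    static OptionalLong promotedKb(String logLine) {
        Matcher m = MINOR_GC.matcher(logLine);
        if (!m.find()) {
            return OptionalLong.empty();
        }
        long youngBefore = Long.parseLong(m.group(1));
        long youngAfter = Long.parseLong(m.group(2));
        long heapBefore = Long.parseLong(m.group(3));
        long heapAfter = Long.parseLong(m.group(4));

        // Old gen occupancy = heap occupancy - young gen occupancy.
        long oldBefore = heapBefore - youngBefore;
        long oldAfter = heapAfter - youngAfter;
        return OptionalLong.of(oldAfter - oldBefore);
    }

    public static void main(String[] args) {
        // Second minor GC from the log above (timestamp 220.331).
        String line = "[PSYoungGen: 17602106K->1307000K(18962944K)] "
                + "24253865K->8618788K(29202944K), 0.2492961 secs]";
        System.out.println("Old gen growth (KB): " + promotedKb(line).getAsLong());
    }
}
```

For that line the old gen grows by 660029K (roughly 645 MB) in a single minor GC, which is consistent with a 10 GB old gen filling up after a handful of collections.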

I'll just address point 4, as the answer from Sachith deals with the first three. You have selected a GC that will not perform an old gen (or full) collection until it is really needed. A full GC is the most expensive kind of collection, but between full GCs the CPU cycles are dedicated to your work. Using a concurrent GC will remove some or all of the full GCs, but you pay for that with CPU cycles spent on concurrent collection work, so there is no guarantee that a concurrent GC will actually be faster.

Also, your flags include -Xms31200m -Xmx31200m. Setting the minimum and maximum heap size to the same value means the VM will not grow or shrink the heap as part of its ergonomics (adaptation). If the performance of your app matters and you have a decent test environment, I would suggest testing different GCs and seeing what kind of performance you get. I would also start from the default settings for everything except the max heap size and see how far that gets you.
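One way to compare collectors in such a test is to run the same workload under each one and read the accumulated GC counts and times from the standard java.lang.management API. The following is a minimal sketch, not a benchmark harness: GcOverhead and runWorkload are hypothetical names, and the allocation loop is only a stand-in for the real service. The reported time is approximate, and what it covers differs between collectors.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: reports how much wall-clock time went into GC during a workload. */
final class GcOverhead {

    /** Sums the accumulated collection time (ms) over all collectors in this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 means "undefined" for this collector
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Fraction of the interval spent in GC; 0 if the interval is empty. */
    static double gcFraction(long gcMillis, long wallMillis) {
        return wallMillis <= 0 ? 0.0 : (double) gcMillis / wallMillis;
    }

    /** Stand-in workload (an assumption): allocates many short-lived buffers. */
    static long runWorkload() {
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] chunk = new byte[8 * 1024];
            chunk[i % chunk.length] = 1;
            checksum += chunk[i % chunk.length];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long gcStart = totalGcMillis();
        long wallStart = System.nanoTime();

        long checksum = runWorkload();

        long wallMillis = (System.nanoTime() - wallStart) / 1_000_000;
        long gcMillis = totalGcMillis() - gcStart;

        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.printf("checksum=%d, wall=%d ms, gc=%d ms, gc fraction=%.3f%n",
                checksum, wallMillis, gcMillis, gcFraction(gcMillis, wallMillis));
    }
}
```

Running the same class with, for example, -XX:+UseParallelGC and then -XX:+UseG1GC gives a rough side-by-side view; the real service and real data should be used before drawing conclusions.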

Well, your questions cannot all be answered with simple explanations:

  1. The JVM uses the -XX:SurvivorRatio parameter to define the survivor space size. The default is -XX:SurvivorRatio=8. This is a ratio: each survivor space is one-eighth the size of the eden space. Because the young generation consists of eden plus two survivor spaces, in your case each survivor space comes to about one-tenth of the roughly 20 GB young generation, i.e. around 2 GB. According to this document, this is often not important for performance. Since you set a fixed, large size for the young generation, the old generation stays the same size. -XX:+UseAdaptiveSizePolicy (enabled by default with ParallelGC) may help the JVM adapt the sizes around the young/old boundary. Also, the larger the young generation, the less often minor collections happen. These minor collections, at which the sizes are re-evaluated, seem to be why you see the survivor space shrink and grow slightly.

  2. The threshold is chosen by the JVM for ParallelGC. As per this article,

If survivor spaces are too small, copying collection overflows directly into the tenured generation. If survivor spaces are too large, they will be uselessly empty. At each garbage collection, the virtual machine chooses a threshold number, which is the number of times an object can be copied before it is tenured. This threshold is chosen to keep the survivors half full.

This seems like aggressive behavior. But the minor collection cycles are significantly far apart, and it seems the threshold is allowed to rise up to 15 if required.

  3. If objects survive the required number of garbage collection cycles in the young generation, then by ParallelGC's design they are destined to move to the old generation. No matter how large the young generation is, you cannot guarantee that long-lived objects stay in it forever. The young generation is for rapidly allocating and deallocating objects, not for long-lived objects. So, as you observed, the old generation eventually fills up and gets cleaned up.

  4. Assuming you're using Java 8 or later, I would suggest using G1GC instead of ParallelGC to improve the throughput of the program. Since your heap is quite large, G1GC would be a good choice. The G1GC algorithm is designed to run on very large, terabyte-scale (TB) heaps with minimal pause times, and it is recommended for heaps larger than 6 GB ( Garbage First Garbage Collector Tuning ). When using G1GC, -XX:+UseStringDeduplication can be a great help if your program holds many duplicate String objects. This GC partitions the entire heap into multiple small regions and runs the collection process using parallel and concurrent threads.

There are also two other GCs, ZGC and Shenandoah, which were introduced as experimental features in Java 11 and Java 12 respectively. These GCs significantly reduce pause times by doing more of the collection work concurrently.

Update: ZGC and Shenandoah are available as stable releases in Java 15, released in September 2020.
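Returning to point 1, the SurvivorRatio arithmetic can be sketched as follows. This is only the nominal calculation under the assumption that the young generation is eden plus two equally sized survivor spaces; the class and method names are hypothetical, and the real JVM aligns the sizes and, with adaptive sizing, adjusts them at run time, so the values in a GC log will not match exactly.

```java
/** Hypothetical helper: nominal young generation layout for a given SurvivorRatio. */
final class YoungGenLayout {

    /**
     * Size of ONE survivor space in bytes.
     * young = eden + 2 * survivor and eden = ratio * survivor,
     * so survivor = young / (ratio + 2).
     */
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    /** Eden is whatever remains after the two survivor spaces. */
    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long young = 21200L * mb; // -XX:NewSize=21200m from the question
        int ratio = 8;            // default -XX:SurvivorRatio

        System.out.println("survivor (MB): " + survivorBytes(young, ratio) / mb);
        System.out.println("eden (MB):     " + edenBytes(young, ratio) / mb);
    }
}
```

With the flags from the question this gives 2120 MB per survivor space and 16960 MB for eden, which is in the same range as the "Desired survivor size" values of roughly 2.0 to 2.7 GB in the log.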

