
How to optimize Java GC for long-living objects

Original 2013-10-07 18:44:06 6 2 java/ caching/ optimization/ garbage-collection

My Java app maintains an internal cache that can grow to 10 GB. The expiration policy is 30 minutes or when a memory threshold is reached (I'm using a local Ehcache). Obviously, after 30 minutes all cached objects will be in the old generation, and it will take a full GC to collect them. Right now the stop-the-world pause can reach 6 seconds, and I'd like to reduce it.

The average object size is 500 KB but can go up to 1 MB, so we are talking about 10,000-20,000 cached objects (actually byte arrays).

What is the best strategy for GC optimisation? I know that I can go off-heap, but that is kind of a last-resort solution.

Thank you!

2 answers

A 10 GB cache is not something you should keep on the heap. Use ByteBuffers for caching. Object creation should not be that costly. This way there is no GC involved and you can manage everything yourself.

For example, if you implement a page cache in a Java database management system, you would not create objects for it but use byte buffers, managed byte buffers or, best of all, direct byte buffers. You can learn more about those three here.
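To make the direct byte buffer idea concrete, here is a minimal sketch of a cache that copies byte arrays into one off-heap slab and keeps only a small index on the heap. The class name DirectSlabCache and its methods are hypothetical, and the sketch assumes an append-only slab with no eviction or space reuse, so it is an illustration rather than a replacement for Ehcache.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one fixed-capacity off-heap slab plus an on-heap index.
// Assumptions: append-only, no eviction, no reuse of freed space, not thread-safe.
final class DirectSlabCache {
    private final ByteBuffer slab;                             // off-heap storage
    private final Map<String, int[]> index = new HashMap<>();  // key -> {offset, length}

    DirectSlabCache(int capacityBytes) {
        this.slab = ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Copies value into the slab; returns false if not enough room is left. */
    boolean put(String key, byte[] value) {
        if (value.length > slab.remaining()) {
            return false;
        }
        int offset = slab.position();
        slab.put(value);                                       // advances the position
        index.put(key, new int[] {offset, value.length});
        return true;
    }

    /** Copies the value back onto the heap, or returns null if the key is absent. */
    byte[] get(String key) {
        int[] entry = index.get(key);
        if (entry == null) {
            return null;
        }
        byte[] out = new byte[entry[1]];
        ByteBuffer view = slab.duplicate();                    // independent position, same memory
        view.position(entry[0]);
        view.get(out);
        return out;
    }
}
```

Only the index entries live on the heap, so the collector has nothing to trace or copy for the cached payloads themselves. A single ByteBuffer is limited to 2 GB, so a 10 GB cache would need several slabs, and the JVM's direct memory limit (-XX:MaxDirectMemorySize) would have to allow for them.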

If you handle more than, let's say, a million objects at a time, you will see the GC's share of run time going up. I have seen situations where we managed a huge number of nodes for data processing and it was really slow. We then switched to a direct byte buffer scheme and, using some additional techniques, were able to fit more data in (objects cost at least 24 bytes each), and stopped thinking in terms of objects in the first place. In the end we handled data, not objects. This increased performance many times over, and we were able to handle much more data than we expected.
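"Handling data, not objects" could look roughly like the following sketch, which packs fixed-size records into a direct buffer instead of allocating one object per node. The NodeTable name and the record layout (an 8-byte id followed by a 4-byte weight) are assumptions made up for illustration; the answer does not describe the actual layout that was used.

```java
import java.nio.ByteBuffer;

// Hypothetical record layout: 8-byte id followed by a 4-byte weight, 12 bytes per node.
// No per-node object exists, so there is no per-node header for the GC to visit.
final class NodeTable {
    private static final int RECORD_BYTES = Long.BYTES + Integer.BYTES;
    private final ByteBuffer data;
    private final int capacity;

    NodeTable(int capacity) {
        this.capacity = capacity;
        this.data = ByteBuffer.allocateDirect(capacity * RECORD_BYTES);
    }

    /** Writes record i using absolute puts, which leave the buffer position untouched. */
    void set(int i, long id, int weight) {
        int base = checkIndex(i) * RECORD_BYTES;
        data.putLong(base, id);
        data.putInt(base + Long.BYTES, weight);
    }

    long id(int i) {
        return data.getLong(checkIndex(i) * RECORD_BYTES);
    }

    int weight(int i) {
        return data.getInt(checkIndex(i) * RECORD_BYTES + Long.BYTES);
    }

    private int checkIndex(int i) {
        if (i < 0 || i >= capacity) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return i;
    }
}
```

With this layout a million nodes occupy 12 MB of off-heap memory and a single ByteBuffer object on the heap, whatever the per-object overhead of the JVM in use happens to be.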

After that we noticed it all fit in a database, and, well, that was the point where we scrapped everything.

So check out what direct buffers can do for you.

I routinely work with caching services holding 10-30 GiB of data in the JVM heap. The Concurrent Mark Sweep (CMS) GC algorithm can handle these cases pretty well, keeping the maximum stop-the-world pause around 100 ms (though absolute numbers depend on hardware).
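As a starting point for trying this out, the sketch below lists the HotSpot flags that select CMS (in a comment) and reports which collectors are active and how much time they have accumulated. The flag values shown are illustrative assumptions, not the tuning from the checklist mentioned in this answer, and the GcReport class is a hypothetical helper. Note that CMS was deprecated in JDK 9 and removed in JDK 14, so the flags only apply to older JVMs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Example launch flags for a CMS-based setup (values are assumptions to tune, not advice):
//   java -Xms12g -Xmx12g -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
//        -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly ...
final class GcReport {
    /** Returns one line per active collector with its collection count and accumulated time. */
    static String summarize() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(", totalMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }
}
```

Printing GcReport.summarize() periodically shows whether the expected collectors are in use. The reported time is accumulated collection time per collector, which for a mostly concurrent collector is not the same as stop-the-world pause time; GC logs are the more reliable source for individual pause lengths.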

You can find a GC tuning checklist for caching applications and heap sizing on my blog.

Here you can find more about the Concurrent Mark Sweep algorithm itself.