
Sensible Xmx/GC defaults for a microservice with a small heap

Original post: 2016-04-17 12:59:44 · Tags: java, garbage-collection, jvm

At my company we are trying an approach with JVM-based microservices. They are designed to be scaled horizontally, so we run multiple instances of each in rather small containers (up to 2G heap, usually 1-1.5G). The JVM we use is 1.8.0_40-b25.

Each such instance typically handles up to 100 RPS, with a maximum memory allocation rate of around 250 MB/s.

The question is: what kind of GC would be a safe, sensible default to start off with? So far we are using CMS with Xms = Xmx = 1.5G (the two are set equal to avoid pauses during heap resizing). Results are decent: we hardly ever see a major GC performed.
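For reference, a small sketch of how one might confirm from inside the service which heap and collector settings a given instance is actually running with. The class name `GcSettingsReport` and the helper `heapAndGcFlags` are made up for illustration; the only JDK calls used are the standard `java.lang.management` beans and `Runtime.maxMemory()`.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original question.
class GcSettingsReport {

    // Keeps only the heap-size flags (-Xms/-Xmx) and collector-selection
    // flags (-XX:+Use...GC) from a list of JVM arguments.
    static List<String> heapAndGcFlags(List<String> jvmArgs) {
        List<String> kept = new ArrayList<>();
        for (String arg : jvmArgs) {
            boolean heapSize = arg.startsWith("-Xms") || arg.startsWith("-Xmx");
            boolean collector = arg.startsWith("-XX:+Use") && arg.endsWith("GC");
            if (heapSize || collector) {
                kept.add(arg);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Arguments the JVM was started with (does not include the main class args).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Heap/GC flags: " + heapAndGcFlags(jvmArgs));

        // Collectors the running JVM registered; names depend on the chosen GC.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("Collector: " + gc.getName());
        }

        // Maximum heap as seen by the runtime; may be slightly below -Xmx.
        System.out.println("Max heap (MB): " + Runtime.getRuntime().maxMemory() / (1024 * 1024));
    }
}
```

Started with something like `java -Xms1536m -Xmx1536m -XX:+UseConcMarkSweepGC GcSettingsReport`, this would list those three flags; the collector names printed are whatever the JVM reports and are not assumed here.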

I know that G1 could give me smaller pauses (at the cost of total throughput), but AFAIK it requires a bit more "breathing" space and at least a 3-4G heap to perform properly.

Any hints (besides going for Azul's Zing :D)?

2 answers

Hint #1: Do experiments!

Assuming that your microservice is deployed on at least two nodes, run one on CMS and another on G1, and see what the response times are.

Not very likely, but what if you find that performance with G1 is so good that you need only half of the original cluster size?
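Response times are the thing to compare, but it can also help to look at how much time each node attributes to GC. A rough sketch, assuming you can run a snippet like this inside each instance (or expose the same numbers over JMX); `GcOverhead` and `gcTimeFraction` are illustrative names, and the times come from the standard `GarbageCollectorMXBean`, which documents them as approximate accumulated elapsed time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for comparing the CMS node against the G1 node.
class GcOverhead {

    // Fraction of wall-clock uptime attributed to GC.
    // Returns 0.0 when uptime is not positive, to avoid dividing by zero.
    static double gcTimeFraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        long totalMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if undefined for this collector
            long millis = gc.getCollectionTime(); // -1 if undefined for this collector
            System.out.printf("%s: count=%d, time=%d ms%n", gc.getName(), count, millis);
            if (millis > 0) {
                totalMillis += millis;
            }
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC time fraction since start: %.4f%n",
                gcTimeFraction(totalMillis, uptime));
    }
}
```

Sampling these counters on both nodes under the same load gives a second data point next to the response-time measurements; it is not a substitute for them.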

Side notes:

re: "250 MB/s" -> if all of this is stack memory (alternatively, if it's young gen) then G1 would provide little benefit, since collection from these areas is free.
re: "100 RPS" -> in many cases in our production systems we found that reducing the number of concurrent requests in the system (either via proxy config or at the application container level) improves throughput. Given the small heap, it's very likely that you have a small number of CPUs as well (2 to 4).
Additionally, there are official Oracle hints on tuning for a small memory footprint. They might not reflect the latest configuration available on 1.8.0_40, but they are a good read anyway.
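As an illustration of the "100 RPS" note, here is one possible way to cap concurrent requests inside the application itself. This is only a sketch using a plain `java.util.concurrent.Semaphore`, not the proxy or container setting the note refers to; `ConcurrencyLimiter` and `callOrReject` are made-up names, and the right limit would have to come from experiments.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Hypothetical in-process limiter; real deployments may prefer the
// proxy's or the application container's own connection/thread limits.
class ConcurrencyLimiter {

    private final Semaphore permits;

    ConcurrencyLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    // Runs the task if a permit is free; otherwise rejects immediately
    // instead of queueing, so excess load fails fast.
    <T> T callOrReject(Callable<T> task) throws Exception {
        if (!permits.tryAcquire()) {
            throw new IllegalStateException("too many concurrent requests");
        }
        try {
            return task.call();
        } finally {
            permits.release(); // always give the permit back
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

With 2 to 4 CPUs, a limit in the same range as the CPU count would be a natural first value to try, e.g. `new ConcurrencyLimiter(4)`.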

Measure how much memory is retained after a full GC. Add to this the amount of memory allocated per second multiplied by 2 to 10, depending on how often you would like a minor GC to run, e.g. every 2 seconds or every 10 seconds.

E.g. say you have up to 500 MB retained after a full GC and a GC every couple of seconds is fine; then you need 500 MB + 2 * 250 MB, or a heap of around 1 GB.
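The arithmetic above can be written down as a tiny calculation. This is only a sketch of the rule of thumb from this answer (retained set plus allocation rate times the desired interval between minor GCs); `HeapEstimate` and `estimateHeapMb` are illustrative names.

```java
// Hypothetical helper implementing the rule of thumb from the answer.
class HeapEstimate {

    // heap = memory retained after a full GC
    //        + allocation rate * seconds between minor GCs
    static long estimateHeapMb(long retainedMb, long allocMbPerSec, long secondsBetweenMinorGcs) {
        return retainedMb + allocMbPerSec * secondsBetweenMinorGcs;
    }

    public static void main(String[] args) {
        // Numbers from the example: 500 MB retained, 250 MB/s allocated.
        System.out.println(estimateHeapMb(500, 250, 2));  // minor GC every ~2 s
        System.out.println(estimateHeapMb(500, 250, 10)); // minor GC every ~10 s
    }
}
```

For the numbers in the example this gives 1000 MB at a 2-second interval and 3000 MB at a 10-second interval, which shows how quickly the heap requirement grows if less frequent minor GCs are wanted.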

The number of RPS is not important.