
JVM Garbage Collection and In-Memory Java Databases

We are evaluating some Java-based in-memory databases like Hazelcast and VoltDB. If we replicate the data across multiple servers, how likely is it that GC will hit both nodes at the same time?

For example, we have two nodes with 500 GB of memory, and we know that GC will affect our performance drastically once it kicks in. So what is the probability that the GCs in both nodes will hit together?

To put this another way: is it possible to prevent GC from hitting the two nodes simultaneously through some configuration? We are expecting a throughput of around 15k requests per second, so with distribution across 4 or more nodes we can tolerate a GC hit on one node at a time (a 25% loss of capacity) and size the cluster accordingly.

If you really want to prevent GC issues, don't use the heap. That is why we are adding an off-heap commercial offering for Hazelcast.

On a general level: you get GC issues if you retain objects too long, or create objects at such a high rate that they get copied to the tenured space. So a lot of high-speed applications try to prevent creating object litter in the first place.
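A minimal sketch of that idea: reuse a preallocated, thread-local buffer so the hot path allocates nothing in steady state. The class name and buffer size here are illustrative, not Hazelcast internals:

    public final class ReusableBufferExample {

        // One 64 KB scratch buffer per thread, allocated once and reused,
        // so steady-state request handling creates no new objects.
        private static final ThreadLocal<byte[]> SCRATCH =
                ThreadLocal.withInitial(() -> new byte[64 * 1024]);

        static int handleRequest(byte[] payload) {
            byte[] scratch = SCRATCH.get();
            // Work happens in the reused buffer; nothing escapes to the
            // heap long enough to be promoted to the tenured generation.
            System.arraycopy(payload, 0, scratch, 0, payload.length);
            int checksum = 0;
            for (int i = 0; i < payload.length; i++) {
                checksum = 31 * checksum + scratch[i];
            }
            return checksum;
        }

        public static void main(String[] args) {
            System.out.println(handleRequest("hello".getBytes()));
        }
    }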

I'm currently working on a POC implementation of Hazelcast where object creation is completely removed.

There is no way to prevent GC kicking in simultaneously in different JVMs through configuration alone. Having said that, you should look at your application and fine-tune the GC.
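As a hedged starting point for that tuning, the usual first step is to make the collector's behavior visible with GC logging and adjust from measurements. These are standard HotSpot options, but the right set depends on your JVM version:

    # Java 8 and earlier: detailed GC logging to a file
    java -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:gc.log -jar app.jar

    # Java 9+: unified logging replaces the flags above
    java -Xlog:gc*:file=gc.log -jar app.jar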

As Ben points out, VoltDB stores all data off heap. The heap is only used for scratch space during transaction routing and stored procedure execution, so the data for each transaction only lives for a few milliseconds, and most of it never ends up being promoted or being live during a GC. Actual SQL execution takes place off heap as well, so temp tables don't generate garbage.

GCs in VoltDB should represent < 1% of execution time. You can choose the percentage by sizing the young generation appropriately. Real-world deployments at that throughput do a young-gen GC every handful of seconds, and those GCs should only block for single-digit milliseconds. Old-gen GCs should be infrequent, on the order of days, and should only block for tens of milliseconds. You can invoke them manually if you want to make sure they happen during off-peak times.
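To make the sizing and manual-invocation points concrete, here is an illustrative sketch; the heap sizes are placeholders, not recommendations:

    # Pin the heap and give the young generation its own fixed size
    java -Xms32g -Xmx32g -Xmn4g -jar app.jar

    // From a scheduled off-peak task: System.gc() requests a full
    // (old-gen) collection. It is only a hint, and it is ignored when
    // the JVM runs with -XX:+DisableExplicitGC.
    System.gc();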

I don't see why concurrent GCs across nodes would matter. The worst case would be if every node that is a dependency for a transaction does a GC back to back, so that the latency is the sum across the involved nodes. I suggest you measure and see if it actually impacts throughput for a period of time that matters to you.

We put a lot of effort into latency in the most recent release, and I can share one of the KPIs.

This is a 3-node benchmark with a 50/50 read/write mix of 32-byte keys and 1024-byte values. There is a single client with 50 threads. There is a node failure during the benchmark, and the benchmark runs for 30 minutes. This is not a throughput benchmark, so there is only one client instance with a smallish number of threads.

Average throughput:               94,114 txns/sec
Average latency:                    0.46 ms
10th percentile latency:            0.26 ms
25th percentile latency:            0.32 ms
50th percentile latency:            0.45 ms
75th percentile latency:            0.54 ms
90th percentile latency:            0.61 ms
95th percentile latency:            0.67 ms
99th percentile latency:            0.83 ms
99.5th percentile latency:          1.44 ms
99.9th percentile latency:          3.65 ms
99.999th percentile latency:       16.00 ms

If you analyze the numbers further and correlate them with other events and metrics, you find that GC is not a factor even at high percentiles. HotSpot's ParNew collector is very good if you can keep your working set small and avoid promotion, and even when it's bad in terms of latency it's good in terms of throughput.
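For reference, ParNew was typically paired with CMS on the Java 8-era JVMs this answer describes, and promotion can be discouraged by giving survivor spaces room and raising the tenuring threshold. An illustrative command line (these collectors are deprecated or removed in recent JDKs; the sizes are placeholders):

    java -XX:+UseParNewGC -XX:+UseConcMarkSweepGC \
         -Xmn2g -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=15 \
         -jar app.jar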

Databases that store data on the heap do have to be more concerned about GC pauses. At VoltDB we are only concerned about them because we are frequently evaluated by maximum pause time, not average pause time or pause time at some percentile.

Assuming you're running Hazelcast/VoltDB on big(ger) servers with plenty of memory and cores, the Garbage-First (G1) garbage collector in newer versions of Java could largely ameliorate your concern.

http://www.oracle.com/technetwork/java/javase/tech/g1-intro-jsp-135488.html
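A minimal, illustrative invocation: enable G1 and state a pause-time goal. The 200 ms target and heap size are just examples, and from JDK 9 onward G1 is the default collector:

    java -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xms64g -Xmx64g -jar app.jar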

VoltDB stores table data off the heap. The memory is allocated by the SQL Execution Engine processes, which are written in C++.

The Java heap in VoltDB is used for relatively static deployment and schema-related data, and for short-term data as it handles requests and responses. Even much of that is kept off-heap using direct byte buffers and other structures (read more about that here).
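A minimal sketch of the direct-byte-buffer technique mentioned above; this is generic java.nio usage, not VoltDB's actual code:

    import java.nio.ByteBuffer;

    public class DirectBufferSketch {
        public static void main(String[] args) {
            // allocateDirect places the 1 KB backing store outside the
            // Java heap, so the GC never scans or copies the payload
            // bytes; only the small ByteBuffer wrapper object itself
            // lives on the heap.
            ByteBuffer buf = ByteBuffer.allocateDirect(1024);
            buf.putLong(42L);                 // write a key
            buf.put("value".getBytes());      // write a value
            buf.flip();                       // switch to reading
            System.out.println("key = " + buf.getLong());
        }
    }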

For an in-memory DB that maintains consistency the way Geode does (i.e. replicates synchronously to other nodes before releasing the client thread), your network is going to be a bigger concern than the HotSpot compiler. Still, here are two points of input to get you to the point where language is irrelevant:

1) If you are doing lots of creates/updates relative to reads: use off-heap memory on the server. This minimizes GCs.

2) Use Geode's serialization mapping between C/C++ and Java objects to avoid JNI. Specifically, use the DataSerializer: http://gemfire.docs.pivotal.io/geode/developing/data_serialization/gemfire_data_serialization.html If you plan to use queries extensively rather than gets/puts, use the PDXSerializer: http://gemfire.docs.pivotal.io/geode/developing/data_serialization/use_pdx_serializer.html (see the sketch after this list).
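As an illustration of the DataSerializer approach, here is a minimal sketch of a class implementing Geode's DataSerializable interface. The Trade class and its fields are made up for the example, and the package name differs between GemFire and Apache Geode releases, so adjust the import to your version:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.geode.DataSerializable;  // com.gemstone.gemfire.* on older GemFire

    // Hand-written toData/fromData lets Geode serialize the object
    // compactly, without Java reflection and without JNI round-trips.
    public class Trade implements DataSerializable {
        private String symbol;
        private long quantity;

        public Trade() {}  // a public no-arg constructor is required for deserialization

        public Trade(String symbol, long quantity) {
            this.symbol = symbol;
            this.quantity = quantity;
        }

        @Override
        public void toData(DataOutput out) throws IOException {
            out.writeUTF(symbol);
            out.writeLong(quantity);
        }

        @Override
        public void fromData(DataInput in) throws IOException, ClassNotFoundException {
            symbol = in.readUTF();
            quantity = in.readLong();
        }
    }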
