
Storing large objects in Hazelcast causes high latency

Posted 2019-11-01 11:47:33. Tags: java, hazelcast

Our RESTful web service uses Hazelcast 3.4.2. We use an IQueue to store byte[] objects: one thread puts objects into the queue and another thread takes them out. We ran a load test with JMeter using 50 threads. When the object is small (under 10k), everything works well: the application response time is always under 50 ms and CPU usage is low. When the object is larger, for example 90k, the response time rises to 500 ms. When the object is 250k, the response time is 2500 ms. At the same time, CPU usage is 60%-80% and memory usage is about 60%-80%.

Our test server is an AWS m5.large with 2 cores and 8 GB of memory. Tomcat is allocated 6 GB of memory.

We tried to fix the issue in the following ways:

Upgrading Hazelcast to 3.12.
Changing backup-count to 0.

Neither change fixed the issue.

Here is the Hazelcast health monitor output from production during a period of high latency. The production server has 8 cores and 16 GB of memory, and Tomcat is allocated 12 GB of memory:

2019-10-25 15:11:13.078][INFO][com.hazelcast.util.HealthMonitor] [ec1-12]:5701 [dev] [3.4.2] processors=8, physical.memory.total=14.7G, physical.memory.free=183.5M, swap.space.total=0, swap.space.free=0, heap.memory.used=1.8G, heap.memory.free=8.8G, heap.memory.total=10.7G, heap.memory.max=10.7G, heap.memory.used/total=17.31%, heap.memory.used/max=17.31%, minor.gc.count=129321, minor.gc.time=2021051ms, major.gc.count=1224, major.gc.time=165825ms, load.process=80.00%, load.system=79.00%, load.systemAverage=469.00%, thread.count=153, thread.peakCount=198, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.priorityOperation.size=0, executor.q.response.size=0, operations.remote.size=8, operations.running.size=2, proxy.count=8, clientEndpoint.count=0, connection.active.count=2, client.connection.count=0, connection.count=2

1 Answer

This issue is not related to Hazelcast. What you are seeing is a typical case of high latency caused by serializing and deserializing a large object.

In any system where data needs to travel from one point to another, the data has to be serialized at the origin, travel over the network, and may be deserialized at the destination when it is sent for storage (not in your case; this depends on configuration). When you retrieve the data, it is serialized at the origin (if it was previously deserialized), travels over the network, and is deserialized at the destination. In your case, the app is spending most of its time in serialization and deserialization, as is evident from the CPU usage.
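To see how the serialize/deserialize round trip scales with payload size, a rough measurement along the following lines may help. This is only a sketch under stated assumptions: it uses the JDK's built-in Java serialization as a stand-in (not Hazelcast's own serializers), it reads the question's 10k, 90k and 250k as kibibytes, and the class and method names are hypothetical. Absolute numbers will differ from what Hazelcast does internally; only the trend across sizes is of interest.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper class, not part of Hazelcast.
public class SerializationCostSketch {

    // Serialize a value with standard Java serialization (stand-in for the real serializer).
    public static byte[] serialize(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuild the value from its serialized form.
    public static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Average time in nanoseconds for one serialize + deserialize round trip.
    public static long averageRoundTripNanos(int payloadSize, int iterations) {
        byte[] payload = new byte[payloadSize];
        Arrays.fill(payload, (byte) 1);
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            byte[] copy = (byte[]) deserialize(serialize(payload));
            if (copy.length != payloadSize) {
                throw new IllegalStateException("round trip changed the payload size");
            }
        }
        return (System.nanoTime() - start) / iterations;
    }

    public static void main(String[] args) {
        // Assumption: the sizes from the question, read as KiB.
        int[] sizes = {10 * 1024, 90 * 1024, 250 * 1024};
        averageRoundTripNanos(sizes[0], 200); // warm-up so the JIT has compiled the hot path
        for (int size : sizes) {
            System.out.println(size + " bytes: "
                    + averageRoundTripNanos(size, 1000) + " ns per round trip");
        }
    }
}
```

Running this on the same instance type as the service, with the same number of concurrent threads as the load test, would give a fairer picture than a single-threaded run on a developer machine.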

The only ways to reduce latency are:
1. Use Hazelcast serialization; see https://docs.hazelcast.org/docs/3.12.4/manual/html-single/index.html#serialization
2. Reduce the size of the object.
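For the second option, one possible way to shrink a byte[] payload is to compress it before it goes into the queue and decompress it after it comes out. The sketch below is an illustration under assumptions, not a tested fix for this service: it uses GZIP from the JDK, the class and method names are hypothetical, and a plain java.util.concurrent.BlockingQueue stands in for the Hazelcast IQueue (IQueue extends BlockingQueue, so the put/take calls have the same shape). Whether this helps depends on how compressible the real data is, and compression itself costs CPU, so it should be measured under load.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of Hazelcast.
public class CompressedQueueSketch {

    // GZIP-compress a payload before it is handed to the queue.
    public static byte[] compress(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the GZIP trailer has been written.
        return buffer.toByteArray();
    }

    // Restore the original payload after it is taken from the queue.
    public static byte[] decompress(byte[] packed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the Hazelcast IQueue<byte[]> used by the service.
        BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();

        // Hypothetical, highly compressible payload of 250 KiB.
        byte[] payload = new byte[250 * 1024];
        Arrays.fill(payload, (byte) 'a');

        byte[] packed = compress(payload);      // producer side
        queue.put(packed);
        byte[] received = decompress(queue.take()); // consumer side

        System.out.println("raw=" + payload.length + " bytes, queued=" + packed.length
                + " bytes, intact=" + Arrays.equals(payload, received));
    }
}
```

Data that is already compressed (images, archives, encrypted blobs) will barely shrink, in which case splitting the payload or storing only a reference in the queue may be the more useful way to reduce the object size.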