简体   繁体   中英

Infinispan clustered REPL_ASYNC cache: command indefinitely bounced between two nodes

Im running a spring boot application using infinispan 10.1.8 in a 2 node cluster. The 2 nodes are communicating via jgroups TCP. I configured several REPL_ASYNC.

The problem: One of these caches, at some point is causing the two nodes to exchange the same message over and over, causing high CPU and memory usage. The only way to stop this is to stop one of the two nodes.

More details, here is the configuration.

 org.infinispan.configuration.cache.Configuration replAsyncNoExpirationConfiguration = new ConfigurationBuilder()
                .clustering()
                    .cacheMode(CacheMode.REPL_ASYNC)
                .transaction()
                    .lockingMode(LockingMode.OPTIMISTIC)
                .transactionMode(TransactionMode.NON_TRANSACTIONAL)
                .statistics().enabled(cacheInfo.isStatsEnabled())
                .locking()
                    .concurrencyLevel(32)
                    .lockAcquisitionTimeout(15, TimeUnit.SECONDS)
                    .isolationLevel(IsolationLevel.READ_COMMITTED)
                .expiration()
                    .lifespan(-1) //entries do not expire
                    .maxIdle(-1) // even when they are idle for some time
                    .wakeUpInterval(-1) // disable the periodic eviction process
                .build();

One of these caches (named formConfig ) is causing me abnormal communication between the two nodes, this is what happens:

  1. with jmeter I generate traffic load targeting only node 1
  2. for some time node 2 receives cache entries from node 1 via SingleRpcCommand, no anomalies, even formConfig cache behaves properly
  3. after some time a new cache entry is sent to the formConfig cache

At this point the same message seems to keep bouncing between the two nodes:

  • node 1 sends entry mn-node1.company.acme-develop sending command to all: SingleRpcCommand{cacheName='formConfig', command=PutKeyValueCommand{key=SimpleKey [form_config,MECHANICAL,DESIGN,et,7850]
  • node 2 receives the entry mn-node2.company.acme-develop received command from mn-node1.company.acme-develop: SingleRpcCommand{cacheName='formConfig', command=PutKeyValueCommand{key=SimpleKey [form_config,MECHANICAL,DESIGN,et,7850]
  • node 2 sends the entry back to node 1 mn-node2.company.acme-develop sending command to all: SingleRpcCommand{cacheName='formConfig', command=PutKeyValueCommand{key=SimpleKey [form_config,MECHANICAL,DESIGN,et,7850]
  • node 1 receives the entry mn-node1.company.acme-develop received command from mn-node2.company.acme-develop: SingleRpcCommand{cacheName='formConfig', command=PutKeyValueCommand{key=SimpleKey [form_config,MECHANICAL,DESIGN,et,7850],
  • node 1 sends the entry to node 2 and so on and on...

Some other things:

  • the system is not under load, jmeter is running only few users in parallel
  • Even stopping jmeter this loop doesn't stop
  • formConfig is the only cache that behaves this way. All the other REPL_ASYNC caches work properly. I deactivated only formConfig cache and the system is working correctly.
  • I cannot reproduce the problem with two nodes running on my machine

Here's a more complete log file including logs from both nodes.

Other infos:

  • opendjdk 11 hot spot
  • spring boot 2.2.7
  • infinispan spring boot starter 2.2.4
  • using JbossUserMarshaller

I'm suspecting

  • something related to transactional configuration
  • or something related to serialization/deserialization of the cached object

The only scenario where this can happen is when the SimpleKey has different hashCode() .

Are there any exceptions in the log? Are you able to check if the hashCode() is the same after serialization & deserialization of the key?

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM