
Spring Integration issue on RabbitMQ cluster restart

We have several RabbitMQ queues in our system, and we use the Spring Integration amqp:inbound-channel-adapter to consume the messages. The Spring application runs on 5 JBoss nodes (not clustered).

The RabbitMQ side is a two-node cluster behind a load balancer, with durable queues. On the application side, the listener definitions are quite simple, with a connection factory defined as follows:

<rabbit:connection-factory id="amqpConnectionFactory"
  username="${orts.rabbitmq.username}" password="${orts.rabbitmq.password}"
  host="${orts.rabbitmq.endpoint}" />

and several inbound-channel-adapters defined like the following:

<amqp:inbound-channel-adapter id="artiqAmqpInboundChannelAdapter"
  channel="artiq.queued.action.filter.outbound.channel" error-channel="artiq.recovery.router.channel"
  connection-factory="amqpConnectionFactory" header-mapper="amqpHeaderMapper"
  queue-names="ortsArtiqQueue" />

We have seen unexpected behavior when, for some reason (e.g. deploying a new configuration), we have to restart the RabbitMQ cluster. After the restart, one or more of the listeners sometimes stop consuming messages, and we have to restart the JBoss nodes to recover.

Note that this behavior is not bound to a specific queue; the impacted queues may be different each time. Also note that the newly deployed configuration doesn't modify any of the existing queues (it happened, for example, when we added new queues).

"the listeners stop consuming messages and we have to restart JBoss nodes to recover."

In my experience, such problems are invariably caused by the listener container thread being "stuck" in some code downstream of the adapter.

To debug, next time it happens, take a thread dump (e.g. with jstack) and look at what the consumer threads are doing.

It doesn't sound like this is your problem, but we did recently fix a bug which caused a similar problem when adding/removing queues to/from an existing listener container. If you are not doing that, then that fix won't help you; you need to look at the thread dump to see what's happening.
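
If running jstack against the JBoss process is awkward, a similar report can be produced from inside the JVM with the standard java.lang.management API. The following is only a minimal sketch: the class name ConsumerThreadDump and the thread-name prefix are hypothetical, the code has to run inside the application's own JVM, and you should check a full dump first to see how your listener container threads are actually named.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class ConsumerThreadDump {

    /**
     * Returns a jstack-like report for every live thread in this JVM whose
     * name starts with the given prefix. The prefix is an assumption supplied
     * by the caller; it is not something Spring Integration guarantees.
     */
    public static List<String> dump(String threadNamePrefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> reports = new ArrayList<>();
        // false, false: do not collect locked monitor/synchronizer details
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null || !info.getThreadName().startsWith(threadNamePrefix)) {
                continue;
            }
            StringBuilder sb = new StringBuilder();
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            // Print every frame; ThreadInfo.toString() truncates the stack
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            reports.add(sb.toString());
        }
        return reports;
    }

    public static void main(String[] args) {
        // Placeholder default; pass the prefix seen in your own thread dump
        String prefix = args.length > 0 ? args[0] : "SimpleAsyncTaskExecutor-";
        dump(prefix).forEach(System.out::println);
    }
}
```

Taking two or three reports a few seconds apart helps: a consumer thread that shows the same downstream frame in every report is a likely candidate for the "stuck" thread.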
