
RabbitMQ Shovel Stuck in 'Terminated' Status

We have an issue where, once in a while, a dynamic shovel (created via the HTTP API: /api/parameters/shovel/) with src-delete-after set to queue-length finishes and then, instead of being deleted, gets stuck in a 'terminated' status.
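For context, a dynamic shovel of this kind is typically declared with a PUT to the management API. The following is only a minimal sketch: the management URL, the guest credentials, the default vhost, and the shovel and queue names are placeholders assumed for illustration, not values from our setup.

```shell
#!/bin/sh
# Placeholders, assumed for illustration only.
MGMT_URL="${MGMT_URL:-http://localhost:15672}"
MGMT_AUTH="${MGMT_AUTH:-guest:guest}"

# Print the JSON definition of a shovel that moves the messages currently in
# queue $1 to queue $2 on the local broker and is then meant to delete itself.
shovel_body() {
    printf '{"value":{"src-uri":"amqp://","src-queue":"%s","dest-uri":"amqp://","dest-queue":"%s","src-delete-after":"queue-length"}}\n' "$1" "$2"
}

# Declare shovel $1 in the default vhost "/" (written as %2F in the path),
# moving messages from queue $2 to queue $3.
create_shovel() {
    curl -sS -u "$MGMT_AUTH" -H 'content-type: application/json' \
        -X PUT -d "$(shovel_body "$2" "$3")" \
        "$MGMT_URL/api/parameters/shovel/%2F/$1"
}
```

With these definitions, `create_shovel my-shovel source-q dest-q` would declare the shovel; all three names are hypothetical.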

Subsequent attempts to delete the shovel by any of the following methods are unsuccessful:

  1. Sending a DELETE request to /api/parameters/shovel/
  2. rabbitmqctl delete_shovel
  3. rabbitmqctl clear_parameter -p <vhost> shovel <shovel_name>
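For reference, attempts 1 and 3 above might look roughly like this on the command line. This is a sketch under assumptions: the management URL, credentials, vhost, and shovel name are placeholders, and the helper function names are made up for this example.

```shell
#!/bin/sh
# Placeholders, assumed for illustration only.
MGMT_URL="${MGMT_URL:-http://localhost:15672}"
MGMT_AUTH="${MGMT_AUTH:-guest:guest}"

# Percent-encode a vhost name for use in an API path.
# Only "/" is handled, which is enough for the default vhost "/".
encode_vhost() {
    printf '%s' "$1" | sed 's|/|%2F|g'
}

# Build the management API URL of the shovel parameter $2 in vhost $1.
shovel_param_url() {
    printf '%s/api/parameters/shovel/%s/%s\n' "$MGMT_URL" "$(encode_vhost "$1")" "$2"
}

# Attempt 1: DELETE through the HTTP API.
delete_shovel_http() {
    curl -sS -u "$MGMT_AUTH" -X DELETE "$(shovel_param_url "$1" "$2")"
}

# Attempt 3: clear the runtime parameter with rabbitmqctl.
delete_shovel_ctl() {
    rabbitmqctl clear_parameter -p "$1" shovel "$2"
}
```

For example, `delete_shovel_http / my-shovel` would issue the DELETE for a hypothetical shovel named my-shovel in the default vhost.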

The shovel doesn't even appear in the 'Shovel Management' section of the RabbitMQ admin UI.

The only way we could get rid of that stuck shovel is by restarting RabbitMQ.

Is anyone else having this issue? If so, how do we clear the shovel without having to restart the cluster? Also, is it possible to prevent this from happening via configuration?

Thanks!

PS:

  1. RabbitMQ version: 3.4.4
  2. Running a 2-node cluster (we will be making it a 3-node cluster shortly because of the obvious issues we could face in the case of a network partition).


You are using a very, very old version of RabbitMQ. Please upgrade to the latest version (3.7.6) and be sure to use Erlang 20.3.x (not 21). If you can still reproduce this issue, please report it on the rabbitmq-users mailing list.

We're using RMQ 3.7.13, Erlang 21.3.1.

One possible way the problem happens:

  • 3 node HA cluster
  • restart one of the nodes (/etc/init.d/rabbitmq-server restart)
  • old shovels AND old queues get resurrected somehow

The shovels can't be deleted using any of the ways mentioned in the question. The only way I was able to get the shovels removed was to disable the shovel plugin on all 3 nodes in the cluster, then re-enable each plugin on each node like so:

rabbitmq-plugins disable rabbitmq_shovel
rabbitmq-plugins enable rabbitmq_shovel
rabbitmq-plugins enable rabbitmq_shovel_management
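To make the ordering explicit (disable on every node first, then re-enable on every node), the steps could be wrapped in a small helper that is run locally on each node. This is a sketch only: the function name and the RUN dry-run hook are made up for this example, and the plugin names used are the ones reported by `rabbitmq-plugins list` (rabbitmq_shovel and rabbitmq_shovel_management).

```shell
#!/bin/sh
# RUN is a hypothetical dry-run hook: with RUN=echo the commands are printed
# instead of executed. Leave it empty to actually run them.
RUN="${RUN:-}"

# Run locally on a node: "disable" on all nodes first, then "enable" on all nodes.
shovel_plugins() {
    case "$1" in
        disable)
            $RUN rabbitmq-plugins disable rabbitmq_shovel rabbitmq_shovel_management
            ;;
        enable)
            $RUN rabbitmq-plugins enable rabbitmq_shovel rabbitmq_shovel_management
            ;;
        *)
            echo "usage: shovel_plugins disable|enable" >&2
            return 2
            ;;
    esac
}
```

Setting `RUN=echo` before calling `shovel_plugins disable` prints the command without touching the node, which is a cheap way to check it before running it on a production cluster.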

As for the old queues getting resurrected (it happens at random, without anyone touching anything; I call them "zombie" queues), this happens about once a month, so I created Postman scripts to delete the resurrected queues. This has been a problem for years. We upgraded RMQ in the hope that it would fix the issue, but it didn't. Perhaps quorum queues are a more robust solution? If I had more time to investigate and experiment, I would, but I'm buried up to my eyeballs in higher priorities.
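A command-line equivalent of those Postman scripts might look like the sketch below, which deletes queues through the management API's /api/queues endpoint. The URL, credentials, vhost, and queue names are placeholders assumed for illustration, and the function names are made up for this example.

```shell
#!/bin/sh
# Placeholders, assumed for illustration only.
MGMT_URL="${MGMT_URL:-http://localhost:15672}"
MGMT_AUTH="${MGMT_AUTH:-guest:guest}"

# Build the management API URL of queue $2 in vhost $1.
# Only "/" in the vhost name is percent-encoded here.
queue_url() {
    printf '%s/api/queues/%s/%s\n' "$MGMT_URL" "$(printf '%s' "$1" | sed 's|/|%2F|g')" "$2"
}

# Delete every queue named after the vhost argument.
delete_queues() {
    vhost=$1
    shift
    for q in "$@"; do
        curl -sS -u "$MGMT_AUTH" -X DELETE "$(queue_url "$vhost" "$q")"
    done
}
```

For example, `delete_queues / zombie-q1 zombie-q2` would remove two hypothetical queues from the default vhost.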


 