RabbitMQ cluster node failure with spring boot application

I have a Spring Boot application that is connected to a RabbitMQ cluster (as a service in Cloud Foundry). When the main node in the cluster fails and for some reason does not come back up, the application (the message consumer) keeps trying to connect to the failed node and does not try to connect to the other available nodes. Could someone suggest some Spring configuration to fix this issue?

17:36:23.829: [APP/PROC/WEB.0] Caused by: com.rabbitmq.client.ShutdownSignalException: channel error; protocol method: #method<channel.close>(reply-code=404, reply-text=NOT_FOUND - home node 'rabbit@rad33f2b1-mq-1.node.dc1.svvc' of durable queue 'FAILED_ORDER' in vhost '/' is down or inaccessible, class-id=50, method-id=10)

'rabbit@rad33f2b1-mq-1.node.dc1.svvc' is the failed node.

In order to keep trying to connect to the nodes on failure, I have the following Spring configuration:

spring.rabbitmq.listener.simple.missing-queues-fatal=false
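For context, a minimal application.properties sketch showing how this property sits alongside the cluster connection settings (the host names and ports are hypothetical; on Cloud Foundry the real addresses normally come from the service binding):

# Hypothetical cluster addresses; listing several nodes lets the client
# fall back to another node when one is unreachable.
spring.rabbitmq.addresses=rabbit-node-1:5672,rabbit-node-2:5672,rabbit-node-3:5672
spring.rabbitmq.listener.simple.missing-queues-fatal=false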

import org.springframework.amqp.core.Binding;
import org.springframework.amqp.core.BindingBuilder;
import org.springframework.amqp.core.DirectExchange;
import org.springframework.amqp.core.Queue;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class MessageConfiguration {

    public static final String FAILED_ORDER_QUEUE_NAME = "FAILED_ORDER";

    public static final String EXCHANGE = "directExchange";

    @Bean
    public Queue failedOrderQueue() {
        // Durable, non-exclusive, non-auto-delete queue (Spring AMQP defaults)
        return new Queue(FAILED_ORDER_QUEUE_NAME);
    }

    @Bean
    public DirectExchange directExchange() {
        // Durable (true) direct exchange that is not auto-deleted (false)
        return new DirectExchange(EXCHANGE, true, false);
    }

    @Bean
    public Binding secondBinding(Queue failedOrderQueue, DirectExchange directExchange) {
        // Bind the queue to the exchange, using the queue name as the routing key
        return BindingBuilder.bind(failedOrderQueue).to(directExchange).with(FAILED_ORDER_QUEUE_NAME);
    }
}

This can happen when you are using a non-HA auto-delete queue with an incorrect master locator.

If the master locator is not client-local, the auto-delete queue might be created on a different node to the one we are connected to. In that case, if the host node goes down, you will get this problem.

To avoid this problem with auto-delete queues, set the x-queue-master-locator queue argument to client-local, or set a policy on the broker to do the same for queues matching this name.
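As an illustration, a minimal sketch of setting that argument when declaring a queue with Spring AMQP's QueueBuilder (the queue name here is hypothetical):

import org.springframework.amqp.core.Queue;
import org.springframework.amqp.core.QueueBuilder;
import org.springframework.context.annotation.Bean;

// Goes inside a @Configuration class such as MessageConfiguration above.
@Bean
public Queue autoDeleteQueueWithLocator() {
    // "someAutoDeleteQueue" is a hypothetical name; the argument asks the broker
    // to place the queue master on the node the declaring connection is attached to.
    return QueueBuilder.nonDurable("someAutoDeleteQueue")
            .autoDelete()
            .withArgument("x-queue-master-locator", "client-local")
            .build();
}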

However, you are not using an auto-delete queue...

@Bean
public Queue failedOrderQueue(){
    return new Queue(FAILED_ORDER_QUEUE_NAME);
}

When using a cluster and a non-HA queue, the queue is not replicated, so if the owning node goes down you will get this error until the owning node comes back up.

To avoid this problem, set a policy to make the queue a mirrored (HA) queue.
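For example, a policy along these lines (the policy name is arbitrary and the pattern assumes the FAILED_ORDER queue above) mirrors the matching queue to all cluster nodes:

# Sketch: mirror queues named FAILED_ORDER across all nodes and sync mirrors automatically
rabbitmqctl set_policy ha-failed-order "^FAILED_ORDER$" '{"ha-mode":"all","ha-sync-mode":"automatic"}'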

https://www.rabbitmq.com/ha.html
