
RabbitMQ cluster is not reconnecting after network failure

I have a RabbitMQ cluster with two nodes in production, and the cluster is breaking with these error messages:

=ERROR REPORT==== 23-Dec-2011::04:21:34 ===
** Node rabbit@rabbitmq02 not responding **
** Removing (timedout) connection **

=INFO REPORT==== 23-Dec-2011::04:21:35 ===
node rabbit@rabbitmq02 lost 'rabbit'

=ERROR REPORT==== 23-Dec-2011::04:21:49 ===
Mnesia(rabbit@rabbitmq01): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, rabbit@rabbitmq02}

I tried to simulate the problem by killing the connection between the two nodes using "tcpkill". The cluster disconnected, and surprisingly the two nodes did not try to reconnect!

When the cluster breaks, the HAProxy load balancer still marks both nodes as active and sends requests to both of them, even though they are no longer in a cluster.

My questions:

  1. If the nodes are configured to work as a cluster, why don't they try to reconnect after a network failure?

  2. How can I identify a broken cluster and shut down one of the nodes? I have consistency problems when working with the two nodes separately.

RabbitMQ clusters do not work well on unreliable networks (as the RabbitMQ documentation notes). So when a network failure happens in a two-node cluster, each node thinks that it is the master and the only node in the cluster. The two masters don't automatically reconnect, because their states are not automatically synchronized (even in the case of a RabbitMQ slave, actual message synchronization does not happen; the slave just "catches up" as messages get consumed from the queue and more messages get added).

To detect whether you have a broken cluster, run the command:

rabbitmqctl cluster_status

on each of the nodes that form part of the cluster. If the cluster is broken, then you'll only see one node, something like:

Cluster status of node rabbit@rabbitmq1 ...
[{nodes,[{disc,[rabbit@rabbitmq1]}]},{running_nodes,[rabbit@rabbitmq1]}]
...done.

In such cases, you'll need to run the following set of commands on one of the nodes that formed part of the original cluster, so that it joins the other master node (say rabbitmq1) in the cluster as a slave:

rabbitmqctl stop_app

rabbitmqctl reset

rabbitmqctl join_cluster rabbit@rabbitmq1

rabbitmqctl start_app

Finally, check the cluster status again; this time you should see both nodes.
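As a sketch of how to automate the detection step from a script, you can parse the cluster_status output and compare the number of running nodes against the expected cluster size. The status string below is a hypothetical sample, modeled on the output format shown above; in practice you would capture it from rabbitmqctl cluster_status itself:

```shell
# Hypothetical sample of `rabbitmqctl cluster_status` output for a two-node
# cluster that has split: both nodes are configured, but only one is running.
status='[{nodes,[{disc,[rabbit@rabbitmq1,rabbit@rabbitmq2]}]},{running_nodes,[rabbit@rabbitmq1]}]'
expected=2

# Extract the entries inside {running_nodes,[...]} and count them.
running=$(echo "$status" \
  | sed -n 's/.*{running_nodes,\[\([^]]*\)\]}.*/\1/p' \
  | tr ',' '\n' | grep -c .)

if [ "$running" -lt "$expected" ]; then
  echo "PARTITIONED: only $running of $expected nodes running"
else
  echo "OK: all $expected nodes running"
fi
```

A monitoring check built this way can alert you to (or fence off) a partitioned node before clients write conflicting state to both sides.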

Note: If you have the RabbitMQ nodes in an HA configuration using a virtual IP (and the clients are connecting to RabbitMQ using this virtual IP), then the node that gets made the master should be the one that has the virtual IP.

RabbitMQ also offers two ways to deal with network partitions automatically: pause-minority mode and autoheal mode. (The default behaviour is referred to as ignore mode.)

In pause-minority mode, RabbitMQ will automatically pause cluster nodes which determine themselves to be in a minority (i.e. fewer than or equal to half the total number of nodes) after seeing other nodes go down. It therefore chooses partition tolerance over availability, in terms of the CAP theorem. This ensures that in the event of a network partition, at most the nodes in a single partition will continue to run.

In autoheal mode, RabbitMQ will automatically decide on a winning partition if a partition is deemed to have occurred, and will restart all nodes that are not in the winning partition. The winning partition is the one which has the most clients connected (or, if this produces a draw, the one with the most nodes; and if that still produces a draw, then one of the partitions is chosen in an unspecified way).

You can enable either mode by setting the configuration parameter cluster_partition_handling for the rabbit application in your configuration file to either pause_minority or autoheal.
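In the classic Erlang-term configuration file (rabbitmq.config), that setting looks roughly like this (shown here with pause_minority as an example choice):

```erlang
%% rabbitmq.config -- enable automatic partition handling for the rabbit app.
%% Valid values: ignore (default), pause_minority, autoheal.
[
  {rabbit, [
    {cluster_partition_handling, pause_minority}
  ]}
].
```

The node must be restarted for the change to take effect.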

Which mode should I pick?

It's important to understand that allowing RabbitMQ to deal with network partitions automatically does not make them less of a problem. Network partitions will always cause problems for RabbitMQ clusters; you just get some degree of choice over what kind of problems you get. As stated in the introduction, if you want to connect RabbitMQ clusters over generally unreliable links, you should use the federation plugin or the shovel plugin.

With that said, you might wish to pick a recovery mode as follows:

  • ignore: Your network really is reliable. All your nodes are in a rack, connected with a switch, and that switch is also the route to the outside world. You don't want to run any risk of any part of your cluster shutting down if any other part of it fails (or you have a two-node cluster).

  • pause_minority: Your network is maybe less reliable. You have clustered across three AZs in EC2, and you assume that only one AZ will fail at a time. In that scenario you want the remaining two AZs to continue working, and the nodes from the failed AZ to rejoin automatically and without fuss when the AZ comes back.

  • autoheal: Your network may not be reliable. You are more concerned with continuity of service than with data integrity. You may have a two-node cluster.

This answer is taken from the RabbitMQ docs; https://www.rabbitmq.com/partitions.html gives a more detailed description.

One other way to recover from this kind of failure is to work with Mnesia, the database that RabbitMQ uses as its persistence mechanism; the synchronization of the RabbitMQ instances (and their master/slave status) is controlled by it. For all the details, refer to the following URL: http://www.erlang.org/doc/apps/mnesia/Mnesia_chap7.html

Adding the relevant section here:

There are several occasions when Mnesia may detect that the network has been partitioned due to a communication failure.

One is when Mnesia is already up and running and the Erlang nodes gain contact again. Then Mnesia will try to contact Mnesia on the other node to see if it also thinks that the network has been partitioned for a while. If Mnesia on both nodes has logged mnesia_down entries from each other, Mnesia generates a system event, called {inconsistent_database, running_partitioned_network, Node}, which is sent to Mnesia's event handler and other possible subscribers. The default event handler reports an error to the error logger.

Another occasion when Mnesia may detect that the network has been partitioned due to a communication failure is at start-up. If Mnesia detects that both the local node and another node received mnesia_down from each other, it generates a {inconsistent_database, starting_partitioned_network, Node} system event and acts as described above.

If the application detects that there has been a communication failure which may have caused an inconsistent database, it may use the function mnesia:set_master_nodes(Tab, Nodes) to pinpoint from which nodes each table may be loaded.

At start-up, Mnesia's normal table load algorithm will be bypassed and the table will be loaded from one of the master nodes defined for the table, regardless of potential mnesia_down entries in the log. Nodes may only contain nodes where the table has a replica, and if it is empty, the master node recovery mechanism for the particular table will be reset and the normal load mechanism will be used at the next restart.

The function mnesia:set_master_nodes(Nodes) sets master nodes for all tables. For each table it will determine its replica nodes and invoke mnesia:set_master_nodes(Tab, TabNodes) with those replica nodes that are included in the Nodes list (i.e. TabNodes is the intersection of Nodes and the replica nodes of the table). If the intersection is empty, the master node recovery mechanism for the particular table will be reset and the normal load mechanism will be used at the next restart.

The functions mnesia:system_info(master_node_tables) and mnesia:table_info(Tab, master_nodes) may be used to obtain information about the potential master nodes.

Determining which data to keep after a communication failure is outside the scope of Mnesia. One approach is to determine which "island" contains a majority of the nodes. Using the {majority, true} option for critical tables can be a way of ensuring that nodes that are not part of a "majority island" are not able to update those tables. Note that this constitutes a reduction in service on the minority nodes. This is a tradeoff in favour of higher consistency guarantees.
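As a sketch of what that looks like in Erlang (the table name and replica placement below are assumptions for illustration), the majority option is passed at table creation time:

```erlang
%% Sketch: declare a hypothetical critical table with {majority, true} so
%% that nodes outside a "majority island" cannot update it during a partition.
mnesia:create_table(critical_tab,
                    [{disc_copies, [node() | nodes()]},
                     {majority, true}]).
```

With this in place, writes to the table on the minority side fail instead of diverging, which is exactly the consistency-over-availability tradeoff described above.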

The function mnesia:force_load_table(Tab) may be used to force-load the table regardless of which table load mechanism is activated.

This is a lengthier and more involved way of recovering from such failures, but it gives better granularity and control over which data should be available on the final master node (and can reduce the amount of data loss that might happen when "merging" RabbitMQ masters).
