
Paxos vs Raft for leader election

Original post: 2017-08-24 22:29:30 · Tags: distributed-system, consensus, paxos, raft

After reading the Paxos and Raft papers, I have the following confusion: the Paxos paper only describes consensus on a single log entry, which is equivalent to the leader election part of the Raft algorithm. What's the advantage of Paxos's approach over the simple random timeout approach in Raft's leader election?

3 answers

It is a common misconception that the original Paxos papers don't use a stable leader. In Paxos Made Simple on page 6 in the section entitled “The Implementation” Lamport wrote:

The algorithm chooses a leader, which plays the roles of the distinguished proposer and the distinguished learner.

This is simply achieved using the Phase 1 messaging of prepare and promises.

Then on pages 9 and 10 under the section “Implementing a State Machine” we have:

In normal operation, a single server is elected to be the leader, which acts as the distinguished proposer (the only one that tries to issue proposals) in all instances of the consensus algorithm.

Here it is using the term 'state machine' in the most generic sense covering the obvious cases such as a key value store or database server where we replicate a log of actions applied to the store.

People get confused about this because of the way Lamport proved Paxos, which is now the way it is taught. Lamport proved the correctness of a class of applications known as Paxos by stripping it down to a mathematical model that can be reasoned about. He called this “The Single-Decree Synod” in the original paper The Part-Time Parliament:

Paxon religious leaders asked mathematicians to formulate a protocol for choosing the Synod's decree. The protocol's requirements and assumptions were essentially the same as those of the later Parliament except that instead of containing a sequence of decrees, a ledger would have at most one decree. The resulting Synod protocol is described here; the Parliamentary protocol is described in Section 3.

If you find that statement confusing, don't worry: it is a bad joke, literally. A translation of this in my own words would be:

“In order to prove the correctness of the consensus algorithm for choosing a stream of commands we can first demonstrate the correctness of a mathematical model which chooses a single command. The mathematical model for selecting a single command can then be extended to the practical algorithm for selecting a stream of commands (Section 3) as long as the invariants of the single command mathematical model are not violated.” – simbo1905

In order to justify my interpretation we can look at Section 3 entitled “The Multi-Decree Parliament” which says:

Instead of passing just one decree, the Paxon Parliament had to pass a series of numbered decrees. As in the Synod protocol, a president was elected. Anyone who wanted a decree passed would inform the president, who would assign a number to the decree and attempt to pass it. Logically, the parliamentary protocol used a separate instance of the complete Synod protocol for each decree number. However, a single president was selected for all these instances, and he performed the first two steps of the protocol just once.

To labour the point: the original “The Part-Time Parliament” paper introduces Paxos as interesting to computer scientists because of its multi-decree algorithm, the parliament protocol. That paper and the clarification paper “Paxos Made Simple” both define Paxos as having a distinguished leader assigning sequence numbers to a stream of commands. Furthermore, the distinguished leader only sends “prepare” messages when it assumes leadership; after that, in steady state, the distinguished leader streams only “accept” messages. Lamport also says elsewhere in the paper to collapse the roles and have all servers run all three roles of the algorithm.
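To make the "prepare once, then stream accepts" flow concrete, here is a minimal in-memory sketch. It is not taken from either paper: the class names `Acceptor` and `Leader`, the integer ballots, and the direct method calls standing in for network messages are all assumptions made for illustration. It also skips details a real implementation needs, such as filling gaps in the log with no-ops and persisting acceptor state.

```python
from dataclasses import dataclass, field


@dataclass
class Acceptor:
    """Hypothetical acceptor whose single promise covers every log slot."""
    promised: int = 0                              # highest ballot promised so far
    accepted: dict = field(default_factory=dict)   # slot -> (ballot, command)

    def on_prepare(self, ballot):
        # Phase 1: promise to ignore lower ballots, report what was accepted.
        if ballot > self.promised:
            self.promised = ballot
            return True, dict(self.accepted)
        return False, {}

    def on_accept(self, ballot, slot, command):
        # Phase 2: accept unless a higher ballot has been promised.
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted[slot] = (ballot, command)
            return True
        return False


class Leader:
    """Hypothetical distinguished proposer (and learner) for all slots."""

    def __init__(self, ballot, acceptors):
        self.ballot = ballot
        self.acceptors = acceptors
        self.majority = len(acceptors) // 2 + 1
        self.prepares_sent = 0
        self.next_slot = 0
        self.log = {}   # slot -> command this leader saw chosen

    def assume_leadership(self):
        # Phase 1 runs once and covers every slot.
        self.prepares_sent += 1
        replies = [a.on_prepare(self.ballot) for a in self.acceptors]
        promises = [acc for ok, acc in replies if ok]
        if len(promises) < self.majority:
            return False
        # Safety rule: re-propose the highest-ballot value seen for each slot.
        recovered = {}
        for acc in promises:
            for slot, (ballot, command) in acc.items():
                if slot not in recovered or ballot > recovered[slot][0]:
                    recovered[slot] = (ballot, command)
        for slot in sorted(recovered):
            self._accept(slot, recovered[slot][1])
        self.next_slot = max(recovered, default=-1) + 1
        return True

    def _accept(self, slot, command):
        acks = sum(a.on_accept(self.ballot, slot, command) for a in self.acceptors)
        if acks >= self.majority:
            self.log[slot] = command
            return True
        return False

    def propose(self, command):
        # Steady state: only "accept" messages, one per new slot.
        slot = self.next_slot
        self.next_slot += 1
        return self._accept(slot, command)


acceptors = [Acceptor() for _ in range(3)]
leader = Leader(ballot=1, acceptors=acceptors)
leader.assume_leadership()
for command in ["set x=1", "set y=2", "del x"]:
    leader.propose(command)
```

In this sketch the leader sends one round of prepares no matter how many commands follow. A second leader with a higher ballot recovers the same log through its own Phase 1, after which the old leader's accepts are rejected.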

Where you ask "What's the advantage of Paxos's approach over the simple random timeout approach in raft's leader election?" I am not sure what approach you are referring to. With Paxos, you can just randomise the timeouts and issue Prepare messages. The Paxos Made Simple paper indicates that you are free to use timeouts or some other faster mechanism as long as you follow the protocol, which will ensure safety:

The famous result of Fischer, Lynch, and Patterson [1] implies that a reliable algorithm for electing a proposer must use either randomness or real time—for example, by using timeouts. However, safety is ensured regardless of the success or failure of the election.

Randomised timeouts are very easy to code and are very understandable. Yet in the worst case, they can lead to a long delay in recovery. If you don't like that, you can use your own leader election mechanism. For example, this one.
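As a rough illustration of how little code randomised timeouts need, here is a sketch. The function names, the 150 to 300 ms range, and the `(round, node_id)` ballot shape are assumptions chosen for the example, not something either paper prescribes for Paxos.

```python
import random


def election_timeout(base_ms=150, spread_ms=150, rng=random):
    """Pick a timeout uniformly from [base_ms, base_ms + spread_ms)."""
    return base_ms + rng.random() * spread_ms


def first_to_time_out(node_ids, rng=random):
    """Return the node whose randomised timer fires first, plus all timers."""
    timers = {node: election_timeout(rng=rng) for node in node_ids}
    winner = min(timers, key=timers.get)
    return winner, timers


def next_ballot(highest_seen, node_id):
    """Build a ballot above any seen so far.

    Ballots are (round, node_id) tuples (an assumed encoding), so two nodes
    can never issue the same ballot even if they time out together.
    """
    return (highest_seen[0] + 1, node_id)


winner, timers = first_to_time_out(["a", "b", "c"])
ballot = next_ballot((0, ""), winner)   # the winner would now send Prepare(ballot)
```

The slow worst case arises when two timers land close together: both nodes send Prepare, pre-empt each other, and have to back off and retry with fresh timeouts.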

After reading the question and @simbo1905's answer, I feel like I have to throw in my 2 cents as I don't think the question has been answered.

What's the advantage of paxos's approach over the simple random timeout approach in raft's leader election?

tl;dr: Paxos is optimal, but Raft has stronger practical guarantees of liveness.

For more information, read on.

As Lamport states in section 3 of Paxos Made Simple,

It can be shown that phase 2 of the Paxos consensus algorithm has the minimum possible cost of any algorithm for reaching agreement in the presence of faults [2]. Hence, the Paxos algorithm is essentially optimal.

So Paxos implements consensus in a way that is maximally efficient when there are no faults.

On the other hand, in the same section he also states:

If multiple servers think they are leaders, then they can all propose values in the same instance of the consensus algorithm, which could prevent any value from being chosen. However, safety is preserved—two different servers will never disagree on the value chosen as the ith state machine command. Election of a single leader is needed only to ensure progress.

This means that, in practice, Paxos can see violations of its liveness guarantee.
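The stall Lamport describes can be reproduced with a toy simulation. The sketch below is an illustration under assumptions: `SingleDecreeAcceptor` and `duel` are made-up names, and the message interleaving is hand-picked to be the worst case, with each new prepare arriving just before the rival's accept.

```python
class SingleDecreeAcceptor:
    """Hypothetical single-decree Paxos acceptor."""

    def __init__(self):
        self.promised = 0      # highest proposal number promised
        self.accepted = None   # (number, value) last accepted, if any

    def prepare(self, n):
        if n > self.promised:
            self.promised = n
            return True
        return False

    def accept(self, n, value):
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def duel(rounds):
    """Two would-be leaders, A and B, keep pre-empting each other.

    Returns the list of chosen values and the acceptors. Because no value is
    ever accepted, each proposer is always free to propose its own value.
    """
    acceptors = [SingleDecreeAcceptor() for _ in range(3)]
    majority = 2
    chosen = []
    pending = None   # (number, value) of whoever prepared most recently
    for n in range(1, rounds + 1):
        value = "A" if n % 2 == 1 else "B"
        # The other proposer's prepare arrives first and wins the promises...
        promises = sum(a.prepare(n) for a in acceptors)
        assert promises >= majority
        # ...so the previous proposer's accept is now too old and is rejected.
        if pending is not None:
            acks = sum(a.accept(*pending) for a in acceptors)
            if acks >= majority:
                chosen.append(pending[1])
        pending = (n, value)
    return chosen, acceptors


chosen, acceptors = duel(rounds=10)
```

Nothing is chosen however many rounds run, yet nothing unsafe happens either: as soon as one proposer is left uncontested, its accept goes through.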

Raft IS Paxos

8-O

...more specifically, the multi-decree version of it with different naming.