
Undoing partial writes in quorum systems

Posted 2021-01-01 16:04:26 · Tags: distributed-system, consensus, quorum

Suppose a quorum system has 5 nodes, and both the write quorum and the read quorum are 3. A client sends a write request w, and w is replicated on only 2 of the 5 nodes. Since it did not reach at least 3 of the 5 nodes, we tell the client that the write was not successful. Immediately afterwards, 2 of the nodes that did not receive the write go down. So, of the remaining 3 nodes, 2 have the partial write and 1 does not. In this case, how does the system figure out that the partial write w needs to be undone, since it never actually completed successfully?

2 Answers

In most systems, 3/5 means that the value is chosen and must not be undone.

Different protocols and systems deal with this in different ways.

In ABD, the next read will learn the value from the one node and propagate it to the other two remaining nodes.
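As a rough illustration of that read path, here is a minimal single-process sketch of an ABD-style read with write-back. The `Replica` class and `abd_read` function are hypothetical names invented for this example, and the sketch assumes integer timestamps and simply contacts the first live quorum rather than waiting for replies over a network.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    ts: int = 0          # logical timestamp of the stored value
    value: object = None
    up: bool = True      # False simulates a crashed node

    def read(self):
        return self.ts, self.value

    def write(self, ts, value):
        # Accept only newer timestamps, so repeated write-backs are harmless.
        if ts > self.ts:
            self.ts, self.value = ts, value


def abd_read(replicas, quorum):
    live = [r for r in replicas if r.up]
    if len(live) < quorum:
        raise RuntimeError("not enough live replicas for a quorum")
    contacted = live[:quorum]
    # Phase 1: query a quorum and keep the value with the highest timestamp.
    ts, value = max((r.read() for r in contacted), key=lambda pair: pair[0])
    # Phase 2: write that value back to the quorum before returning it.
    for r in contacted:
        r.write(ts, value)
    return value
```

With the question's scenario (w on two live nodes, absent on the third, two nodes down), `abd_read(replicas, quorum=3)` returns w and leaves it stored on all three survivors, so the partial write is completed rather than undone.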

Similarly, in Paxos, the next proposer will learn the value from that one node and propagate it to the other two. In practical systems based on Paxos, some component (perhaps the leader) will issue a no-op proposal to ensure that the value is propagated in a timely manner.
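The Paxos rule behind this can be sketched as the value-selection step a proposer runs after collecting phase 1 promises. The function name `choose_value` and the tuple layout are assumptions made for this example, not part of any particular library.

```python
def choose_value(promises, own_value):
    """Pick the value a proposer must propose after phase 1.

    promises: list of (accepted_ballot, accepted_value) pairs, one per
    acceptor in the quorum; (None, None) means nothing was accepted yet.
    """
    accepted = [(ballot, value) for ballot, value in promises if ballot is not None]
    if not accepted:
        # No acceptor in the quorum has accepted anything: free choice,
        # for example a no-op.
        return own_value
    # Otherwise the proposer must adopt the value with the highest ballot.
    return max(accepted, key=lambda pair: pair[0])[1]
```

For example, `choose_value([(1, "w"), (1, "w"), (None, None)], "no-op")` yields `"w"`: the no-op proposal ends up carrying the previously accepted value to the acceptors that missed it.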

In leader-election/primary-replica systems like Raft, the leader will ensure propagation to the replicas, unless of course the leader is one of the nodes that died. In that case, the election process typically requires the new leader to be the most up-to-date node, which here means its log would include the disputed value.
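A simplified sketch of that election constraint, in the style of Raft's log comparison: a voter grants its vote only if the candidate's log is at least as up to date as its own. The helper names and the `(last_term, last_index)` tuples are assumptions for illustration, and the sketch ignores the rest of the protocol (current terms, one vote per term, timeouts).

```python
def candidate_is_up_to_date(cand_last_term, cand_last_index,
                            my_last_term, my_last_index):
    # A later last term wins; with equal terms, the longer log wins.
    if cand_last_term != my_last_term:
        return cand_last_term > my_last_term
    return cand_last_index >= my_last_index


def votes_for(candidate, voters):
    """Count voters (including the candidate itself) that would grant a vote.

    candidate and each voter are (last_term, last_index) tuples.
    """
    return sum(1 for voter in voters
               if candidate_is_up_to_date(*candidate, *voter))
```

In the question's scenario the three survivors have logs `(1, 1)`, `(1, 1)` and `(0, 0)`. A survivor holding w collects 3 votes, which is a majority of the 5-node cluster, while the node without w collects only its own vote and cannot become leader.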

This is a common scenario to consider while building a distributed system: how to handle failures, or events that do not reach all of the nodes they need to reach.

It is the leader's responsibility to make sure an event is successfully sent to the defined minimum quorum before the event is considered committed [there should be a flag keyed by the hashcode of that event's ID] and is no longer eligible for retry.

When a request comes in to fetch data from the distributed system, it always goes to the leader, which delegates the request to a nearby node chosen by its hashcode. Before delegating, though, the leader has to make sure that the commit flag is TRUE (meaning the event was delivered to the minimum defined number of nodes). Otherwise, the leader can throw an exception.
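One way to picture the commit-flag check is the following sketch. The `Leader` class, `NotCommittedError`, and the in-memory dictionaries are hypothetical and stand in for whatever bookkeeping a real system uses; the delegation to a nearby node is left out.

```python
class NotCommittedError(Exception):
    """Raised when a read targets an event that never reached a quorum."""


class Leader:
    def __init__(self, quorum):
        self.quorum = quorum
        self.acks = {}      # event_id -> set of node ids that stored the event
        self.payloads = {}  # event_id -> payload

    def replicate(self, event_id, payload, acked_by):
        # Record which nodes acknowledged the event; return the commit flag.
        self.payloads[event_id] = payload
        self.acks[event_id] = set(acked_by)
        return self.is_committed(event_id)

    def is_committed(self, event_id):
        # The commit flag: True once at least `quorum` nodes hold the event.
        return len(self.acks.get(event_id, ())) >= self.quorum

    def read(self, event_id):
        # Refuse to serve events whose commit flag is not set.
        if not self.is_committed(event_id):
            raise NotCommittedError(event_id)
        return self.payloads[event_id]
```

With a quorum of 3, an event acknowledged by only two nodes stays uncommitted, and `read` raises `NotCommittedError` instead of serving it.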

Also, the leader should stamp a version number on the event and keep track of the current committed version number. When retrieving the event, the leader can compare event versions to make sure all nodes are returning the latest data.
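The version comparison could look roughly like this. The function `latest_and_stale` and the reply format are assumptions for the example: each node is taken to answer with a `(version, value)` pair.

```python
def latest_and_stale(replies):
    """Return the newest value and the ids of nodes holding older versions.

    replies: dict mapping node id -> (version, value).
    """
    newest_version, newest_value = max(replies.values(), key=lambda r: r[0])
    stale = sorted(node for node, (version, _) in replies.items()
                   if version < newest_version)
    return newest_value, stale
```

The list of stale nodes tells the leader which replicas still need the latest version before they can safely serve reads.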

You can see this functionality out of the box in systems such as ZooKeeper.