
How do these database management systems practically behave during a network partition?

I am looking into deploying a database management system, replicated across regions (various data centers across a country). I am currently looking into the following candidates:

  • MongoDB (NoSQL, CP system)
  • CockroachDB (SQL, CP system)
  • Cassandra (NoSQL, AP system)

How do those three behave during a network partition between nodes? Let's assume we deploy all of them in 3-node clusters.

What happens if the 2 secondary/follower nodes become separated from their leader during a network failure?

Will MongoDB and Cockroach block reads during a network partition? If so, for the entire duration of the partition or only during leader election (Cockroach)?

Will Cassandra allow reads during a network partition?

The answer for all three is, in theory, the same: it's up to the application making the read request. You can choose either availability (the read succeeds but could be out of date) or consistency (the read generally fails). The details vary among the three, as does the degree to which each database is able to actually honor the guarantees it makes.

Cassandra

Cassandra in theory: Cassandra reads and writes specify how many nodes need to acknowledge the request in order for it to be considered successful. This allows you to tune your consistency, availability, and throughput requirements per workload. For strong consistency in an N-node cluster, you can require a total of N+1 acks across reads and writes combined. In your 3-node example, you could require all 3 nodes to ack a write and only 1 to ack a read. In that case, writes can't be accepted during any network partition, so reads can proceed without sacrificing consistency. Or you could require 3 nodes for a read and only 1 for a write, reversing the availability. More commonly, applications require a majority for both reads and writes: 2 nodes each in this case. This means both reads and writes can fail during a network partition, but it maximizes overall performance. It's also common to just require 1 ack for all queries and live with some inconsistency.
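
As a concrete sketch of how that per-query tuning looks in application code, here is a minimal example using the DataStax Python driver; the contact points, keyspace, and table are hypothetical:

```python
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# Hypothetical 3-node cluster and schema
cluster = Cluster(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
session = cluster.connect("app_keyspace")

# Write at ALL (3 acks): no writes are accepted during a partition...
write = SimpleStatement(
    "INSERT INTO users (id, name) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.ALL,
)
session.execute(write, (42, "alice"))

# ...which is what lets a read at ONE (1 ack) stay available during
# the partition without returning stale data
read_one = SimpleStatement(
    "SELECT name FROM users WHERE id = %s",
    consistency_level=ConsistencyLevel.ONE,
)
row = session.execute(read_one, (42,)).one()

# The common middle ground: QUORUM (2 of 3) on both reads and writes,
# so either operation fails if a majority of replicas is unreachable
read_quorum = SimpleStatement(
    "SELECT name FROM users WHERE id = %s",
    consistency_level=ConsistencyLevel.QUORUM,
)
```

Because the consistency level is a property of each statement rather than of the cluster, the same table can serve both strongly consistent and availability-first queries side by side.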

Cassandra in practice: You're going to have to live with some inconsistency regardless. Cassandra generally doesn't pass the Jepsen test suite for detecting inconsistent writes; under heavy load and a network partition you're likely to end up with some corrupted data even when requesting otherwise.

MongoDB

MongoDB in theory: MongoDB has a primary node and secondary nodes. If you enable secondary reads, you get data that could be out of date. If you don't, read attempts only go to the primary node, so if you're cut off from it, some reads will fail until MongoDB recovers.
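
A minimal sketch of that choice with PyMongo (the hosts, replica set name, database, and collection are hypothetical):

```python
from pymongo import MongoClient, ReadPreference

# Hypothetical 3-node replica set
client = MongoClient("mongodb://node1,node2,node3/?replicaSet=rs0")

# The default read preference is PRIMARY: reads fail while the client
# is partitioned away from the primary (or while an election runs)
orders = client.shop.orders
fresh = orders.find_one({"status": "open"})

# Opting in to secondary reads keeps reads available during a
# partition, at the cost of possibly stale data
stale_orders = client.shop.get_collection(
    "orders", read_preference=ReadPreference.SECONDARY_PREFERRED
)
maybe_stale = stale_orders.find_one({"status": "open"})
```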

MongoDB in practice: Historically, MongoDB has not done well when its consistency is tested: its earlier versions used a protocol that is considered fundamentally flawed, leading to stale and dirty reads even when requesting full consistency. As of 2017, it tentatively seemed like they had fixed those issues with a new protocol. Of these three, Mongo's the one I haven't worked with directly, so I'll leave it at that.

CockroachDB

CockroachDB in theory: By default, CockroachDB chooses consistency. If you're lucky, some reads in the first 9 seconds of a network partition will hit the node that had already acquired a 9-second lease on all the data needed to serve the request. As long as the nodes can't establish a quorum, they can't create new leases, so eventually all reads start failing: no single node can be confident that the other two nodes aren't accepting new writes. However, Cockroach allows "bounded staleness reads" that can be served without a lease. Queries of the form SELECT code FROM promo_codes AS OF SYSTEM TIME with_max_staleness('10s') will continue to succeed for the first 10-19 seconds of a network partition.
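
Since CockroachDB speaks the PostgreSQL wire protocol, a bounded-staleness read can be issued from any Postgres client. Here is a sketch with psycopg2; the connection string is hypothetical, and the query reuses the promo_codes example above:

```python
import psycopg2

# Hypothetical connection string for one node of the cluster
conn = psycopg2.connect("postgresql://root@node1:26257/defaultdb")
# Bounded-staleness reads must run as single-statement (implicit)
# transactions, so disable psycopg2's automatic BEGIN
conn.autocommit = True

with conn.cursor() as cur:
    # Served without a lease, so it keeps working early in a partition,
    # returning data that is at most ~10 seconds stale
    cur.execute(
        "SELECT code FROM promo_codes"
        " AS OF SYSTEM TIME with_max_staleness('10s')"
    )
    print(cur.fetchall())
```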

CockroachDB in practice: CockroachDB brought in Aphyr, the researcher whose Jepsen analyses I linked above, early in its development process. It now runs nightly Jepsen tests simulating a network partition under load and verifying consistency, so it's unlikely to violate its consistency guarantee in that particular way.

Summary

All three databases make an effort to support choosing either consistency or availability. Reads in "consistent mode" will start failing during a network partition until a majority of nodes reestablish communication with each other. Reads in "availability mode" are less likely to fail during a network partition, but there's a risk you're reading from one isolated node while the other two have reestablished communication with each other and started accepting new writes. Of the three databases, Cassandra has the most flexibility for specifying this behavior per query, while CockroachDB has the most reliable guarantee of consistency.

