
MongoDB Sharding + Replication

Original post: 2018-10-22 01:51:40. Tags: mongodb, replication, sharding

I am new to MongoDB and I am trying to understand how these two technologies work together:

When using replication for your database, you have a primary node and a bunch of secondaries. To ensure consistency, it's recommended that you always read from the primary node, right?

So when you use replication with sharding, for example: you have 2 replicas, r1 and r2, on different servers; the data is partitioned by an id from 1 to 250 into 2 shards, shard 1 with 1 - 125 and shard 2 with 126 - 250.

Now my questions: when using partitioning with sharding, does it mean that every shard has its own primary node? So when reading the document with id 130, do I first have to find out where the primary node of shard 2 is located?

For example: r1 has the primary node for 1-125 and a secondary for 126-250

r2 has the primary node for 126-250 and a secondary for 1-125

Is that correct?

Does every replica still keep the full database information?

Best regards

1 Answer

When using replication for your database, you have a primary node and a bunch of secondaries. To ensure consistency, it's recommended that you always read from the primary node, right?

The answer is yes and no. Yes: you normally read from the primary node. If you read from a secondary instead, the data may lag slightly behind the primary because of replication delay, but the result is usually nearly the same as reading from the primary.

No: you do not need to check where the primary node is in order to read. Just specify the replica set in the connection string and forget about it. Work with it just like a single database.
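A minimal sketch of what such a connection string could look like, assuming three hypothetical hosts and a replica set named `rs0` (neither comes from the question). `replicaSet` and `readPreference` are standard MongoDB URI options; the helper `replica_set_uri` is made up for illustration and only builds the string, it does not connect to anything.

```python
from urllib.parse import urlencode

# Read preference modes accepted in a MongoDB connection string.
READ_PREFERENCES = {
    "primary", "primaryPreferred", "secondary", "secondaryPreferred", "nearest",
}


def replica_set_uri(hosts, replica_set, read_preference="primary"):
    """Build a replica-set connection string (hypothetical helper)."""
    if read_preference not in READ_PREFERENCES:
        raise ValueError(f"unknown read preference: {read_preference}")
    options = urlencode(
        {"replicaSet": replica_set, "readPreference": read_preference}
    )
    return f"mongodb://{','.join(hosts)}/?{options}"


# Hypothetical host names and replica set name, for illustration only.
hosts = ["db1.example.net:27017", "db2.example.net:27017", "db3.example.net:27017"]
uri = replica_set_uri(hosts, "rs0")
print(uri)
```

Given a URI like this, a driver discovers the current primary on its own. Switching `readPreference` to `secondaryPreferred` would send reads to secondaries when one is available, at the cost of possibly slightly stale data.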

Now my questions: when using partitioning with sharding, does it mean that every shard has its own primary node?

Yes.

So when reading the document with id 130, do I first have to find out where the primary node of shard 2 is located?

No. When connecting to the cluster you should connect via mongos (https://docs.mongodb.com/manual/reference/program/mongos/). It does everything for you, from finding which shard contains your data to locating the primary node, and so on. With mongos you work with the cluster just like a single database.
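To make the routing idea concrete, here is a toy model of range-based lookup using the ranges from the question (1-125 and 126-250). It is only a sketch of the concept, not how mongos is actually implemented; the names `UPPER_BOUNDS`, `SHARDS` and `route` are invented for this example.

```python
from bisect import bisect_left

# Toy routing table: the inclusive upper bound of each shard's id range,
# taken from the question (shard 1 holds 1-125, shard 2 holds 126-250).
UPPER_BOUNDS = [125, 250]
SHARDS = ["shard1", "shard2"]


def route(doc_id):
    """Return the shard whose range contains doc_id (toy model, not mongos code)."""
    if not 1 <= doc_id <= UPPER_BOUNDS[-1]:
        raise KeyError(f"id {doc_id} is outside every range")
    # bisect_left finds the first upper bound that is >= doc_id.
    return SHARDS[bisect_left(UPPER_BOUNDS, doc_id)]


print(route(130))  # shard2
print(route(125))  # shard1
```

The client never performs this lookup itself: it sends the query to mongos, which picks the shard, and the primary of that shard's replica set is then located for it.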

The only thing you need to care about is performance. For that, you should read about and understand sharded collections and shard keys: https://docs.mongodb.com/manual/core/sharding-shard-key/

For example: r1 has the primary node for 1-125 and a secondary for 126-250. r2 has the primary node for 126-250 and a secondary for 1-125. Is that correct?

-> Wrong. Data is separated by shard key; see above for details. In this case, if you use id (1 - 250) as the shard key:

r1 will contain 1 - 125 in both primary and secondary (the secondary is a backup of the primary: whatever the primary has is cloned to the secondary)
r2 will contain 126 - 250 in both primary and secondary too (in detail: the r2 primary contains 126 - 250, and the r2 secondary contains 126 - 250 too; a secondary node is a mirror of the primary node)
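The layout described above can be sketched as a small in-memory model, treating r1 and r2 as one replica set per shard. Everything here is an illustrative assumption: the `ReplicaSet` class is invented, each set has a single secondary, and the copy to the secondary happens immediately, whereas real replication is asynchronous.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaSet:
    """Toy model of one shard: a primary and one secondary holding the same data."""
    low: int
    high: int
    primary: dict = field(default_factory=dict)
    secondary: dict = field(default_factory=dict)

    def insert(self, doc_id, doc):
        if not self.low <= doc_id <= self.high:
            raise ValueError(f"id {doc_id} does not belong to this shard")
        self.primary[doc_id] = doc          # the write goes to the primary
        self.secondary[doc_id] = dict(doc)  # and is then copied to the secondary


r1 = ReplicaSet(1, 125)    # shard 1
r2 = ReplicaSet(126, 250)  # shard 2

for doc_id in (7, 125, 130, 250):
    target = r1 if doc_id <= 125 else r2
    target.insert(doc_id, {"_id": doc_id})

print(sorted(r1.primary), sorted(r1.secondary))  # [7, 125] [7, 125]
print(sorted(r2.primary), sorted(r2.secondary))  # [130, 250] [130, 250]
```

Neither r1 nor r2 ever holds ids from the other range, and within each set the primary and secondary hold the same ids.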

Does every replica still keep the full database information?

No. Every replica set contains only the part of a sharded collection defined by the shard key. A sharded collection is a big table that you want to spread over several machines to improve performance. The primary shard of a database ( https://docs.mongodb.com/manual/core/sharded-cluster-shards/#primary-shard ) additionally holds all of that database's unsharded collections, not a full copy of the sharded ones.