
Sharding with replication

I have a multi-tenant database with 3 tables (store, products, purchases) on 5 server nodes. Suppose I have 3 stores in my store table and I am going to shard it by storeId. I need all data for all shards (1, 2, 3) available on nodes 1 and 2, but node 3 would contain only the shard for store #1, node 4 only the shard for store #2, and node 5 only the shard for store #3. It is like sharding with 3 replicas. Is this possible at all? What database engines can be used for this purpose (preferably SQL DBs)? Do you have any experience with this?
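For concreteness, a minimal sketch of the schema being described, with storeId as the intended shard key (columns beyond the table names mentioned above are illustrative):

    -- Minimal sketch of the multi-tenant schema; storeId is the shard key.
    -- Column names other than storeId are illustrative.
    CREATE TABLE store (
        storeId INT PRIMARY KEY,
        name    VARCHAR(100)
    );

    CREATE TABLE products (
        productId INT PRIMARY KEY,
        storeId   INT NOT NULL,  -- shard key
        name      VARCHAR(100),
        FOREIGN KEY (storeId) REFERENCES store (storeId)
    );

    CREATE TABLE purchases (
        purchaseId INT PRIMARY KEY,
        storeId    INT NOT NULL,  -- shard key
        productId  INT NOT NULL,
        FOREIGN KEY (storeId) REFERENCES store (storeId)
    );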

Regards

I have a feeling you have not adequately explained why you are trying this strange topology.

Anyway, I will point out several things relating to MySQL/MariaDB.

  • A Galera cluster already embodies multiple nodes (minimum of 3), but does not directly support "sharding". You can have multiple Galera clusters, one per "shard".
  • As with my comment about Galera, other forms of MySQL/MariaDB can have replication between the nodes of each shard.
  • If you are thinking of having one server with all the data, but replicating only parts of it to readonly replicas, there are settings for replicate_do/ignore_database (a config sketch follows this list). I emphasize "readonly" because changes to these pseudo-shards cannot easily be sent back to the Primary server. (However, see "multi-source replication".)
  • Sharding is used primarily when there is simply too much traffic to handle on a single server. Are you saying that the 3 tenants cannot coexist because of excessive writes? (Excessive reads can be handled by replication.)
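A sketch of the replicate_do/ignore approach from the list above, assuming each tenant lives in its own database (store1_db, store2_db, store3_db are hypothetical names; the option names are standard MySQL/MariaDB replication filters):

    # my.cnf fragment for node 3, the readonly replica that should
    # hold only store #1's data. Assumes one database per tenant;
    # store1_db etc. are hypothetical names.
    [mysqld]
    read_only           = ON           # pseudo-shard stays readonly
    replicate-do-db     = store1_db    # replicate only this tenant
    # or, equivalently, exclude the other tenants:
    # replicate-ignore-db = store2_db
    # replicate-ignore-db = store3_db

Note that these filters work at database granularity, so this layout presumes one database per tenant rather than all three tenants sharing one database.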

A tentative solution:

Have all data on all servers. Use the same Galera cluster for all nodes.
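A minimal sketch of the per-node configuration for that single cluster (the provider path and host names are placeholders that vary by installation):

    # my.cnf fragment for one node of a single 5-node Galera cluster.
    # Host names and the provider path are placeholders.
    [mysqld]
    binlog_format          = ROW
    default_storage_engine = InnoDB
    wsrep_on               = ON
    wsrep_provider         = /usr/lib/galera/libgalera_smm.so
    wsrep_cluster_name     = all_stores
    wsrep_cluster_address  = gcomm://node1,node2,node3,node4,node5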

Advantage: When "most" or all of the network is working, all data is quickly replicated bidirectionally.

Potential disadvantage: If half or more of the nodes go down, you have to manually step in to get the cluster going again.

Likely solution for the 'disadvantage': "Weight" the nodes differently. Give a high weight to the 3 in HQ; give a much smaller (but non-zero) weight to each branch node. That way, most of the branches could go offline without losing the system as a whole.
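Galera exposes this weighting as the pc.weight provider option (default 1); quorum goes to the partition holding strictly more than half of the total weight. A sketch:

    # On each HQ node: a high weight, so HQ alone can keep quorum.
    wsrep_provider_options = "pc.weight=3"

    # On each branch node: a much smaller, but non-zero, weight.
    wsrep_provider_options = "pc.weight=1"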

But... I fear that an offline branch node will automatically become readonly.

Another plan:

Switch to NDB. The network is allowed to be fragile. Consistency is maintained by "eventual consistency" instead of the "[virtually] synchronous replication" of Galera+InnoDB.

NDB allows you to immediately write on any node. Then the write is sent to the other nodes. If there is a conflict, one of the values is declared the "winner". You choose the algorithm that determines the winner. An easy-to-understand one is "whichever write was 'first'".
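For reference, NDB configures per-table conflict handling through the mysql.ndb_replication system table; a sketch using the epoch-based function, assuming that table has already been created as the MySQL manual describes (the database name shop is hypothetical, and the binlog_type value follows the manual's examples):

    -- Sketch: epoch-based conflict resolution for the purchases table.
    -- server_id = 0 applies the rule on all servers; binlog_type 7
    -- (NBT_FULL_USE_UPDATE, per the manual's examples) logs full rows
    -- as updates, which the NDB$ conflict functions rely on.
    INSERT INTO mysql.ndb_replication
           (db, table_name, server_id, binlog_type, conflict_fn)
    VALUES ('shop', 'purchases', 0, 7, 'NDB$EPOCH()');

With NDB$EPOCH(), conflicting writes within an epoch are resolved in favor of the primary, which is one concrete form of the "winner" rule described above.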
