
Best way to shard MySQL database

I have a huge number of users, so I need to split the database into n shards. To do this, I have the following options:

  1. Divide my data into n shards based on userId modulo n. That is, with 10 shards, userId 1999 goes to shard 1999 % 10 = 9.
    Problem: if the number of shards increases in the future, the existing assignment of users to shards no longer holds.

  2. I can maintain a table that maps UserId to ShardId.
    Problem: if my users grow to billions, this mapping table will itself need to be sharded, which doesn't seem like a good solution.

  3. I can maintain a static mapping in code, such as userIds 0-10000 in Shard 1, and so on.
    Problems:

    • As the number of shards and users grows, the code needs to be changed more often.
    • If a specific user in a shard has a huge amount of data, it becomes difficult to split that shard.
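
To make the problem with option 1 concrete, here is a minimal Python sketch. The function names (`shard_for`, `moved_fraction`) are made up for illustration. It counts how many users land on a different shard when the shard count goes from 10 to 11.

```python
def shard_for(user_id: int, num_shards: int) -> int:
    """Option 1: pick a shard with a plain modulus."""
    return user_id % num_shards


def moved_fraction(user_ids, old_n: int, new_n: int) -> float:
    """Fraction of users whose shard changes when the shard count changes."""
    ids = list(user_ids)
    moved = sum(1 for u in ids if shard_for(u, old_n) != shard_for(u, new_n))
    return moved / len(ids)


print(shard_for(1999, 10))                      # 9, as in the example above
print(moved_fraction(range(110_000), 10, 11))   # 10/11, roughly 0.91
```

With sequential userIds, about 10 out of every 11 users would have to move just to add one shard, which is why plain modulo does not survive resharding.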

So, these are the three approaches I could find, but each has some problem. What would be an alternative or better approach to sharding the MySQL tables that can cope with a growing number of shards and users in the future?

I prefer a hybrid of 1 and 2:

  1. Hash the UserId into, say, 4096 values.
  2. Look up that number in a 'dictionary' that has shard numbers in it.
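
A rough Python sketch of those two steps follows. The choice of CRC32 as the hash, the starting layout of 4 shards, and the names `bucket_for`, `bucket_to_shard`, and `shard_for` are assumptions for illustration; any hash that is stable across clients would do.

```python
import zlib

NUM_BUCKETS = 4096  # the "say, 4096 values" above; fixed once chosen


def bucket_for(user_id: int) -> int:
    """Step 1: hash a UserId into one of NUM_BUCKETS hash numbers."""
    # crc32 gives the same result on every client and in every process,
    # which Python's built-in hash() does not guarantee for str/bytes.
    return zlib.crc32(str(user_id).encode("ascii")) % NUM_BUCKETS


# Step 2: the 'dictionary', hash number -> shard number.
# Hypothetical starting layout: 4 shards, hash numbers dealt out round-robin.
bucket_to_shard = {b: b % 4 for b in range(NUM_BUCKETS)}


def shard_for(user_id: int) -> int:
    """Route a user: hash first, then look the hash number up."""
    return bucket_to_shard[bucket_for(user_id)]


print(shard_for(1999))
```

The dictionary has only 4096 rows no matter how many users there are, so it avoids the billion-row mapping table from option 2.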

If a shard gets too full, migrate all the users with some hash number to another shard.

If you add a shard, migrate a few hash numbers to it, preferably from busy shards.
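
As a sketch of the bookkeeping side of adding a shard (not the data copy itself), the hypothetical `add_shard` helper below reassigns hash numbers in the dictionary. It uses "owns the most hash numbers" as a stand-in for "busy", which is an assumption; a real setup would more likely look at load or disk usage.

```python
from collections import Counter


def add_shard(bucket_to_shard: dict[int, int], new_shard: int,
              buckets_to_move: int) -> list[int]:
    """Reassign hash numbers to new_shard, one at a time, always taking
    from the shard that currently owns the most. Returns the moved ones."""
    counts = Counter(bucket_to_shard.values())
    moved = []
    for _ in range(buckets_to_move):
        donor = max((s for s in counts if s != new_shard),
                    key=lambda s: counts[s])
        bucket = next(b for b, s in bucket_to_shard.items() if s == donor)
        # A real migration would copy this hash number's users to the
        # new shard here, before flipping the dictionary entry.
        bucket_to_shard[bucket] = new_shard
        counts[donor] -= 1
        counts[new_shard] += 1
        moved.append(bucket)
    return moved


mapping = {b: b % 4 for b in range(4096)}   # 4 existing shards
moved = add_shard(mapping, new_shard=4, buckets_to_move=819)
print(len(moved), "of", len(mapping), "hash numbers moved")
```

Only the users in the moved hash numbers (here about a fifth of them) need to be copied; everyone else stays where they are.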

This forces you to write a script for moving users, and to make it robust. Once you have that, a lot of other admin tasks become 'simple':

  • Retire a machine
  • Upgrade the OS (one by one across shards)
  • Upgrade whatever software is on the machines
  • Migrate a hash number that is bulky but not busy to an old, slow shard that has a big disk. Similarly, migrate a small and busy hash number to a shard with more cores and faster disks.

Each shard could be an HA cluster (Galera, Group Replication, etc.) of servers for both reliability and read-scaling. (Sharding gives you write-scaling.)

There would need to be a way to distribute the dictionary to all clients "promptly".

All of this works well if you have, say, each hash in 3 different shards for HA. Each of the 3 would be at a different geographic location for robustness. The dictionary would have 4 columns to say where the copies are. The 4th would be used during migrations.
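
One possible shape for such a dictionary row, sketched in Python. The field names and the exact role of the 4th column (an extra write target while a copy is being moved) are assumptions; the answer only says that the 4th column is used during migrations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BucketEntry:
    """One dictionary row: three copies plus a migration slot."""
    copy1: int
    copy2: int
    copy3: int
    migrating_to: Optional[int] = None  # 4th column, set only mid-migration

    def read_shards(self) -> list[int]:
        """Shards that hold a complete copy of this hash number."""
        return [self.copy1, self.copy2, self.copy3]

    def write_shards(self) -> list[int]:
        """Shards that must receive writes, including a migration target."""
        shards = self.read_shards()
        if self.migrating_to is not None:
            shards.append(self.migrating_to)
        return shards


def finish_migration(entry: BucketEntry, retiring: int) -> BucketEntry:
    """Swap the retiring copy for the target and clear the 4th column."""
    if entry.migrating_to is None:
        raise ValueError("no migration in progress")
    cols = [entry.migrating_to if s == retiring else s
            for s in entry.read_shards()]
    return BucketEntry(*cols)


entry = BucketEntry(1, 5, 9)
entry.migrating_to = 12               # start moving the copy on shard 5
print(entry.write_shards())           # [1, 5, 9, 12]
print(finish_migration(entry, 5))     # copies are now on shards 1, 12, 9
```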
