简体   繁体   English

将10位数的char用户ID分配给1000个服务器中的1个

[英]Assign 10-digit char user ids to 1 of 1000 servers

Looking to shard a database and to assign different users to different home servers based on their user id. 寻找分片数据库,并根据他们的用户ID将不同的用户分配给不同的家庭服务器。 User IDs are 10 character strings, eg, "f4gKUKkj91" ... each server has an ID of 1 - 1000. How can I create a hash function in php to uniquely and consistently assign each user id to a specific shard ? 用户ID是10个字符串,例如“ f4gKUKkj91” ...每个服务器的ID为1-1000。如何在php中创建哈希函数以唯一且一致地将每个用户ID分配给特定的分片? If the user id were an integer I could do userid % 1000 ... but since they are alphanumeric I'm not sure how to do this with even distribution in php. 如果用户ID是整数,则可以执行userid % 1000 ...,但是由于它们是字母数字,所以我不确定如何在php中均匀分配。

Thank you! 谢谢!

您可以使用crc32()为您提供字母数字用户ID的数字哈希。

This is not a perfect algorithm, as there will be a slight preference for smaller ID numbers. 这不是一个完美的算法,因为对于较小的ID号会有一点偏爱。 It assumes the user IDs are spread out fairly evenly, so to speak; 可以这么说,它假设用户ID分布均匀。 if they're not, the distribution may not be good. 如果不是,则分布可能不好。

Figure out what your alphabet is and put it in a string like $str = '0123456789abcdefghijklmnopqrstuvwxxyzABCDEFGHIJKLMNOPQRSTUVXYZ'; 弄清楚您的字母是什么,并将其放在$str = '0123456789abcdefghijklmnopqrstuvwxxyzABCDEFGHIJKLMNOPQRSTUVXYZ';这样的字符串中$str = '0123456789abcdefghijklmnopqrstuvwxxyzABCDEFGHIJKLMNOPQRSTUVXYZ'; This string has n characters. 该字符串包含n个字符。 Now, we will essentially treat the user ID as a base n integer. 现在,我们基本上将用户ID视为基数n整数。

For each character, find its index in the string (0-based). 对于每个字符,在字符串(从0开始)中找到其索引。 Take this index and multiply it with n x , where x is the character position in your original string, starting with 0. Add all of these together, and take the modulo of the sum. 取该索引并乘以n x ,其中x是原始字符串中字符的位置,从0开始。将所有这些加在一起,然后求和。

You probably only want to do this for a few characters - once you've read a few characters, the sum becomes quite big, and PHP can't handle it properly unless you resort to using functions suitable for large integer math (you can certainly use GMP and such, but it may not be ideal for your case). 您可能只想对几个字符执行此操作-阅读了几个字符后,总和就变得很大,PHP不能正确处理它,除非您诉诸使用适合于大整数数学的函数(当然可以使用GMP之类的方法,但对于您的情况可能并不理想)。 If you are using native integers, stop before the maximum possible sum goes beyond 2^31 (n x +n x+1 +...+n). 如果使用本机整数,请在最大可能和超过2 ^ 31(n x + n x + 1 + ... + n)之前停止。

You can use either start from the beginning or going backwards (going backwards corresponds to usual integer notation). 您可以使用从头开始或后退(后退对应于通常的整数符号)。 One of them may be more suitable, depending on how the ID generation works. 其中一种可能更适合,具体取决于ID生成的工作方式。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM