
How can I hash strings into a specific number of buckets?

I'm trying to come up with an algorithm to hash a string into a specific number of buckets, but I haven't had any luck coming up with ideas on how to do this.

I have a list of strings like this:

a.jpg
b.htm
c.gif
d.jpg
e.swf

and I would like to run a function to get a number between 1 and 4 based on the string.

e.g. a.jpg would be 3
b.htm would be 2
c.gif would be 1
etc.

It needs to be consistent, so if I run the function on a.jpg it always returns 3.

This algorithm would be for splitting resources between servers...

e.g. a.jpg would be accessed from server3.mydomain.com
b.htm would be accessed from server2.mydomain.com
etc.

Does anyone know how I would go about doing this?

Any advice would be much appreciated!

Cheers

Tim

You may find the following blog post useful. The proposed algorithm is:

int bucketIndex = (int)((uint)"d.jpg".GetHashCode() % (uint)buckets.Length);
int bucket = (int)(unchecked(((uint)s.GetHashCode())) % 4 + 1);

(where s is the string)
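The one-liners above can be wrapped into a small helper. This is only a minimal sketch: the names `Buckets`, `BucketFor` and `ServerFor` are made up for illustration, and the host name pattern is taken from the question. One caveat worth hedging on: `string.GetHashCode()` is only guaranteed to be consistent within a single running process, so this sketch is probably fine within one application instance but may give different buckets on different servers or after a restart.

```csharp
using System;

public static class Buckets
{
    // Hypothetical helper: maps a string to a bucket number in [1, bucketCount].
    public static int BucketFor(string s, int bucketCount)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        if (bucketCount <= 0) throw new ArgumentOutOfRangeException(nameof(bucketCount));

        // Reinterpret the (possibly negative) int hash as unsigned,
        // so the remainder can never be negative.
        uint hash = unchecked((uint)s.GetHashCode());
        return (int)(hash % (uint)bucketCount) + 1;
    }

    // Hypothetical helper: builds a host name in the style used in the question.
    public static string ServerFor(string s, int serverCount)
    {
        return "server" + BucketFor(s, serverCount) + ".mydomain.com";
    }
}
```

Calling `Buckets.ServerFor("a.jpg", 4)` then yields one of server1 to server4 for that file name.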

The standard GetHashCode and % will work: Math.Abs("aaaa".GetHashCode()) % numberOfBuckets.

EDIT: thanks to Thomas Levesque for the reminder that GetHashCode() can return a value < 0. I added Math.Abs to make the code correct, but the versions in the other answers are likely to work better.
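Two things may be worth keeping in mind here. First, `Math.Abs(int.MinValue)` throws an `OverflowException`, so the `Math.Abs` variant can still fail for a string whose hash happens to be `int.MinValue`; the `uint` cast avoids that. Second, since the question asks for the same result every time across servers, a hash that does not depend on the runtime may be safer than `GetHashCode()`. The following is a rough sketch using 32-bit FNV-1a, which is a swapped-in technique and not something proposed in the answers; `StableBuckets`, `Fnv1a32` and `BucketFor` are hypothetical names.

```csharp
using System;
using System.Text;

public static class StableBuckets
{
    // 32-bit FNV-1a over the UTF-8 bytes of the string.
    // The result depends only on the input, not on the process or runtime.
    public static uint Fnv1a32(string s)
    {
        const uint offsetBasis = 2166136261;
        const uint prime = 16777619;

        uint hash = offsetBasis;
        foreach (byte b in Encoding.UTF8.GetBytes(s))
        {
            hash ^= b;
            hash = unchecked(hash * prime); // wrap around on overflow
        }
        return hash;
    }

    // Hypothetical helper: bucket number in [1, bucketCount].
    public static int BucketFor(string s, int bucketCount)
    {
        if (bucketCount <= 0) throw new ArgumentOutOfRangeException(nameof(bucketCount));
        return (int)(Fnv1a32(s) % (uint)bucketCount) + 1;
    }
}
```

Note that file names are hashed case-sensitively here, so "a.jpg" and "A.JPG" may land in different buckets unless the input is normalised first.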

Use a hash algorithm based on a shared machine key. This will create a unique identifier per string. If you require integers, then use a dictionary object to map strings to ints. Every time you add a new string, set its value to the current dictionary length. Finally, store the dictionary in a farm-based state object, such as a shared session, so that each site instance can reference it.
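The dictionary part of this suggestion could look roughly like the sketch below. It is only an in-memory illustration: `StringIdRegistry`, `GetOrAddId` and `BucketFor` are hypothetical names, the step that turns an id into a bucket is my assumption and not part of the answer, and sharing the dictionary across the farm is left out entirely.

```csharp
using System;
using System.Collections.Generic;

public sealed class StringIdRegistry
{
    private readonly Dictionary<string, int> _ids = new Dictionary<string, int>();
    private readonly object _gate = new object();

    // Returns the existing id for s, or assigns the current dictionary count as a new id.
    public int GetOrAddId(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        lock (_gate)
        {
            int id;
            if (!_ids.TryGetValue(s, out id))
            {
                id = _ids.Count;
                _ids.Add(s, id);
            }
            return id;
        }
    }

    // Assumption: spread ids over buckets 1..bucketCount in round-robin order.
    public int BucketFor(string s, int bucketCount)
    {
        if (bucketCount <= 0) throw new ArgumentOutOfRangeException(nameof(bucketCount));
        return GetOrAddId(s) % bucketCount + 1;
    }
}
```

With this approach the bucket depends on the order in which strings were first seen, so it only stays consistent as long as every instance reads the same shared dictionary.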

