Using an MD5 Hash as an index

I am building a MongoDB collection that contains a specific set of data, and I want to run comparisons against that data by taking an MD5 (or maybe SHA256) hash of the data and comparing the hashes.

I was wondering whether a fixed-length string of hex digits is the right way of doing this. Is there a better datatype to use, such as a "blob" or even a 64-bit long integer to hold the values? (This may require me to use a hashing function that produces longs. I don't know of one, except maybe overriding the Java .hashCode() function with Eclipse?)

If there is a better way entirely, advice on best practice would be appreciated here!

Storing MD5 Hashes in MongoDB

If you decide to store an MD5 hash, you have to use either String or Binary (see here). Binary takes half the size: an MD5 digest is 16 raw bytes, versus 32 characters when written as a hex string.
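To make the size difference concrete, here is a minimal sketch using only Python's standard `hashlib`. The helper name `md5_forms` is hypothetical and chosen for illustration. How the value reaches MongoDB depends on your driver; as an assumption, a driver such as PyMongo would store a `bytes` value as BSON binary and a `str` value as a BSON string.

```python
import hashlib


def md5_forms(data: bytes):
    """Return the MD5 digest of `data` as (hex string, raw bytes).

    Hypothetical helper for illustration only.
    """
    digest = hashlib.md5(data)
    return digest.hexdigest(), digest.digest()


hex_form, raw_form = md5_forms(b"hello world")

# The hex string is twice as long as the raw digest.
print(len(hex_form))  # 32 characters
print(len(raw_form))  # 16 bytes

# Both forms carry the same information and convert losslessly.
print(bytes.fromhex(hex_form) == raw_form)  # True
```

Either form works as an equality key for comparisons; the binary form simply halves the storage for the value itself.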

Best Hash Function

This is tough to answer, since it depends heavily on the kind of data in your collection. I personally think that MD5 hashes are a good choice, but again it depends on the use case. In case you want to customize/optimize your hash, this post and this post might get you started. They cover some simple recipes for writing a custom hash function.
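On the 64-bit long idea raised in the question, one possible approach (not taken from the linked posts) is to truncate a standard digest to its first 8 bytes and read them as a signed 64-bit integer, which fits MongoDB's 64-bit integer type. The sketch below assumes SHA-256 as the underlying hash and uses a hypothetical helper name, `hash64`. Keep in mind that truncating to 64 bits makes collisions more likely than with the full digest, so this only suits cases where an occasional collision is acceptable or is checked for separately.

```python
import hashlib
import struct


def hash64(data: bytes) -> int:
    """Derive a signed 64-bit integer from the SHA-256 digest of `data`.

    Hypothetical helper: keeps only the first 8 bytes of the digest,
    so it is far more collision-prone than the full 32-byte hash.
    """
    digest = hashlib.sha256(data).digest()
    # ">q" reads 8 bytes as a big-endian signed 64-bit integer.
    (value,) = struct.unpack(">q", digest[:8])
    return value


key = hash64(b"some record contents")

# The result always fits in a signed 64-bit range.
print(-(2**63) <= key < 2**63)  # True
```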

