
C++ hash<string>: is there a way to get the same value on Linux and Windows?

I am trying to find a way to get the same result when I hash a given string on Windows and on Linux. For example, if I run the following code:

hash<string> h;
cout << h("hello");

it returns 3305111549 on Windows and 2762169579135187400 on Linux.

If it is not possible to get the same return value across these two platforms, is there any other decent hash function that would return the same value on both systems?

No. As per the std::hash reference:

The actual hash functions are implementation-dependent and are not required to fulfill any other quality criteria except those specified above.

More specifically, you are using the std::hash<std::string> template specialization, whose hashes:

equal the hashes of corresponding std::basic_string_view classes

which are also implementation-dependent. So no, you cannot expect the same std::hash results from different implementations. Furthermore, since C++14:

Hash functions are only required to produce the same result for the same input within a single execution of a program;

Not only can you not depend on hash values being the same across platforms, but the standard does not even guarantee that a hash value will be the same across different runs of the same program. It only guarantees that the value will be the same within a single run.

This is the only requirement the C++14 standard places on the returned value (besides that its type should be std::size_t) (17.6.3.4):

The value returned shall depend only on the argument k for the duration of the program. [ Note: Thus all evaluations of the expression h(k) with the same value for k yield the same result for a given execution of the program. - end note ] [ Note: For two different values t1 and t2, the probability that h(t1) and h(t2) compare equal should be very small, approaching 1.0 / numeric_limits<size_t>::max(). - end note ]

(where h is a hash functor and k is the key)
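To make that guarantee concrete, here is a minimal sketch. The helper names are hypothetical and not part of any library. It checks the one property the standard does promise (equal keys hash equally within a single run) and exposes the width of std::size_t, which by itself can differ between a 32-bit and a 64-bit build.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: true if two equal keys hash equally in this run.
// This is the only behaviour the standard promises for std::hash.
inline bool hash_is_stable_within_run(const std::string& key) {
    std::hash<std::string> h;
    const std::string copy(key);  // a distinct object holding the same value
    return h(key) == h(copy);
}

// Hypothetical helper: number of bits in the hash result on this platform.
constexpr std::size_t hash_result_bits() {
    return sizeof(std::size_t) * CHAR_BIT;
}

// Safe to rely on: this holds on every conforming implementation.
inline void check_guarantee() {
    assert(hash_is_stable_within_run("hello"));
}
```

Comparing the value of h("hello") between two machines, or even between two runs, tests something the standard never promised, so a mismatch there is not a bug.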

If you want the same value on both platforms, use a well-known hash algorithm, such as MurmurHash3.
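As an illustration of the idea, here is a short sketch that swaps in FNV-1a instead of MurmurHash3, purely because FNV-1a fits in a few lines. The function name fnv1a_64 is my own. The important parts are that the algorithm is fixed by you rather than by the standard library, and that the result type is std::uint64_t rather than std::size_t, so its width does not change between builds. It assumes C++14 or later for the constexpr loop.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// 64-bit FNV-1a over a byte range. The two constants are the published
// FNV offset basis and FNV prime for the 64-bit variant.
constexpr std::uint64_t fnv1a_64(const char* data, std::size_t size) {
    std::uint64_t hash = 0xcbf29ce484222325ULL;        // offset basis
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= static_cast<unsigned char>(data[i]);   // mix in one byte
        hash *= 0x100000001b3ULL;                      // multiply by the prime
    }
    return hash;
}

// Convenience overload for std::string.
inline std::uint64_t fnv1a_64(const std::string& text) {
    return fnv1a_64(text.data(), text.size());
}

// Compile-time checks against known FNV-1a values.
static_assert(fnv1a_64("", 0) == 0xcbf29ce484222325ULL,
              "empty input yields the offset basis");
static_assert(fnv1a_64("a", 1) == 0xaf63dc4c8601ec8cULL,
              "known FNV-1a value for the single byte 'a'");
```

Calling fnv1a_64(std::string("hello")) should then give the same number on Windows and Linux, provided both sides feed in the same bytes.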

It won't work with std::hash:

The actual hash functions are implementation-dependent and are not required to fulfill any other quality criteria except those specified above. Notably, some implementations use trivial (identity) hash functions which map an integer to itself. In other words, these hash functions are designed to work with unordered associative containers, but not as cryptographic hashes, for example.

http://en.cppreference.com/w/cpp/utility/hash

I am trying to find a way to get the same result when I hash a given string on Windows and on Linux. For example, if I run the following code:

hash<string> h;
cout << h("hello");

it returns 3305111549 on Windows and 2762169579135187400 on Linux.

The results are correct. As mentioned in the other answers, the C++ standard does not even guarantee that the values will be the same between different executions of the same program.

If it is not possible to get the same return value across these two platforms, is there any other decent hash function that would return the same value on both systems?

Yes! You may want to check out "Best hashing algorithms for speed and uniqueness" for a list of good hash functions to implement.

However, after you select the one you want to use, you need one more guarantee: that the underlying representation of characters is the same on the two platforms. That is, the numerical representation of 'a' on platform 1 must be the same as that of 'a' on platform 2. If one platform uses ASCII and the other uses a different encoding scheme, you are unlikely to get the same results.
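A small sketch of why the encoding matters, again using FNV-1a (32-bit this time) as an assumed stand-in for whichever algorithm you pick; the names are hypothetical. The hash only ever sees byte values, so the same character encoded two different ways produces two different hashes.

```cpp
#include <cstddef>
#include <cstdint>

// 32-bit FNV-1a over raw bytes, using the published offset basis and prime.
constexpr std::uint32_t fnv1a_32(const unsigned char* bytes, std::size_t size) {
    std::uint32_t hash = 0x811c9dc5u;   // offset basis
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= bytes[i];               // the hash sees bytes, not characters
        hash *= 0x01000193u;            // FNV prime
    }
    return hash;
}

// Fails to compile on a platform whose execution character set is not
// ASCII-compatible, which is exactly the situation described above.
static_assert('a' == 0x61, "this sketch assumes 'a' is encoded as 0x61");

// The letter 'a' written out as an explicit byte.
constexpr unsigned char letter_a[] = {0x61};

// The same character (e with acute accent) in two different encodings.
constexpr unsigned char e_acute_utf8[]   = {0xC3, 0xA9};
constexpr unsigned char e_acute_latin1[] = {0xE9};

static_assert(fnv1a_32(letter_a, 1) == 0xe40c292cu,
              "known FNV-1a value for the single byte 0x61");
static_assert(fnv1a_32(e_acute_utf8, 2) != fnv1a_32(e_acute_latin1, 1),
              "different encodings of one character hash differently");
```

If the strings may contain anything beyond plain ASCII, it helps to agree on one encoding (UTF-8, say) and convert to it on both platforms before hashing.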


Again, the standard library already provides the specialization std::hash<std::string>. So, beyond what your standard library provides, there is nothing you can do to enforce a particular result for std::hash<std::string>()("hello"). Your options are to use:

  • a custom hash function object, e.g. myNAMESPACE::hash<std::string>()("hello"), or
  • a custom string type for which you specialize std::hash, e.g. std::hash<MyString>()("hello")
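Both options could look roughly like the sketch below. myNAMESPACE and MyString are the placeholder names used in the list above. Everything else (the choice of FNV-1a, the member names, the check function) is my own assumption for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>

namespace myNAMESPACE {

// Assumed algorithm for this sketch: 64-bit FNV-1a. Any fixed, documented
// algorithm would serve the same purpose.
inline std::uint64_t fnv1a_64(const std::string& text) {
    std::uint64_t hash = 0xcbf29ce484222325ULL;
    for (char c : text) {
        hash ^= static_cast<unsigned char>(c);
        hash *= 0x100000001b3ULL;
    }
    return hash;
}

// Option 1: your own hash template, independent of std::hash.
template <class Key>
struct hash;  // primary template intentionally left undefined

template <>
struct hash<std::string> {
    // Fixed-width result, so 32-bit and 64-bit builds agree.
    std::uint64_t operator()(const std::string& s) const { return fnv1a_64(s); }
};

// Option 2: a custom string type that std::hash can be specialized for.
struct MyString {
    std::string value;
    MyString(const char* s) : value(s) {}
};

inline bool operator==(const MyString& a, const MyString& b) {
    return a.value == b.value;
}

}  // namespace myNAMESPACE

namespace std {
// Specializing std::hash for a user-defined type is permitted.
template <>
struct hash<myNAMESPACE::MyString> {
    std::size_t operator()(const myNAMESPACE::MyString& s) const noexcept {
        // Caveat: std::size_t may be only 32 bits wide, in which case this
        // conversion truncates and the value differs from a 64-bit build.
        return static_cast<std::size_t>(myNAMESPACE::fnv1a_64(s.value));
    }
};
}  // namespace std

// Sanity check against a known FNV-1a value for "a".
inline void check_known_vector() {
    assert(myNAMESPACE::hash<std::string>()("a") == 0xaf63dc4c8601ec8cULL);
}
```

Because std::hash must return std::size_t, the first option is the safer one when the hash value itself has to match across platforms. The second is mainly useful when you want the type to work as a key in std::unordered_map.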


 