High performance string hashing function in Java/Scala

I'm looking for a high-performance String hashing function in Java/Scala: something faster than the functions in the MurmurHash family. It doesn't need to be cryptographically strong; it only needs to distribute well.

Any suggestions?

The fastest hashing algorithm that fits the bill presently seems to be xxHash. The lz4-java project contains an implementation ported to Java. I don't know whether the Java implementation has been benchmarked against MurmurHash, though; performance optimizations in C++ don't always port to/from Java. (In particular, xxHash contains more array access, so there could be non-negligible bounds-checking overhead.)

Edit: it looks to me like it uses JNI to call the C++ implementation of xxHash, but JNI overhead is non-negligible, so the performance concerns remain.
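Since the concern above is about unmeasured performance, one way to settle it on your own JVM is to time the candidates directly. Below is a minimal sketch of a naive timing loop, not a rigorous benchmark (a harness such as JMH would be more trustworthy). The class name `HashTiming` and its helper methods are hypothetical, and `Arrays.hashCode(byte[])` only stands in as a placeholder candidate; you would plug in the xxHash or MurmurHash call you actually want to compare.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.ToIntFunction;

// Hypothetical helper for rough comparisons of hash functions over byte arrays.
final class HashTiming {

    // Returns the average nanoseconds per hash call, measured after one warm-up pass.
    static double nanosPerHash(ToIntFunction<byte[]> fn, byte[][] inputs, int rounds) {
        int sink = 0;
        // Warm-up pass so the JIT has a chance to compile the hash function.
        for (int r = 0; r < rounds; r++) {
            for (byte[] in : inputs) {
                sink ^= fn.applyAsInt(in);
            }
        }
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (byte[] in : inputs) {
                sink ^= fn.applyAsInt(in);
            }
        }
        long elapsed = System.nanoTime() - start;
        // Use the accumulated value so the hashing work cannot be optimized away.
        if (sink == 42) {
            System.out.print("");
        }
        return (double) elapsed / ((long) rounds * inputs.length);
    }

    // Generates `count` pseudo-random byte arrays of the given length.
    static byte[][] randomInputs(int count, int length, long seed) {
        Random rnd = new Random(seed);
        byte[][] inputs = new byte[count][length];
        for (byte[] in : inputs) {
            rnd.nextBytes(in);
        }
        return inputs;
    }

    public static void main(String[] args) {
        byte[][] inputs = randomInputs(1_000, 64, 1L);
        // Placeholder candidate: replace with the hash functions under comparison.
        double ns = nanosPerHash(Arrays::hashCode, inputs, 200);
        System.out.printf("Arrays.hashCode: %.1f ns/hash%n", ns);
    }
}
```

Run each candidate with the same inputs and round count, and treat small differences as noise; the numbers depend heavily on input length and JVM version.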

However, given that Scala includes a MurmurHash function, and that Java contains a faster default hash (about 2x) that is sorta-reasonably distributed sometimes, one does wonder whether it's really necessary. For instance, scala.util.hashing.MurmurHash3 is about as fast as string creation from an array of bytes, and is twice as fast as that if you give it an array of bytes.
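To make the "default hash" concrete: assuming it refers to `String.hashCode()`, that method is documented to compute a simple base-31 polynomial over the string's chars. The sketch below re-implements it (the class name `PolyHash` is hypothetical) and shows why the distribution is only sometimes reasonable: short strings collide easily.

```java
// Sketch of the polynomial hash that java.lang.String.hashCode() is documented to compute.
final class PolyHash {

    // h = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], in wrapping int arithmetic.
    static int hash(CharSequence s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String key = "high performance hashing";
        System.out.println(hash(key) == key.hashCode()); // prints true

        // A well-known collision: 65*31 + 97 == 66*31 + 66 == 2112.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // prints true
    }
}
```

One multiply and one add per char is why this is hard to beat on speed, and the `Aa`/`BB` collision is why it can be a poor fit when keys are short or adversarial.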

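You can find very fast Java hash function implementations that, BTW, take the internal String implementation (a char[] array) into account to maximize speed, here: https://github.com/OpenHFT/Zero-Allocation-Hashing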
