简体   繁体   中英

Hashing n-grams by cyclic polynomials - java implementation

I'm solving some problem that involves Rabin–Karp string search algorithm. This algorithm requires rolling hash to be faster then naive search. This article describes how to implement rolling hash. I implemented "Rabin-Karp rolling hash" without problems and found few implementations implementations , but article also mentions computational complexity and that hashing n-grams by cyclic polynomials is prefered. It links to BuzHash implementation of such technique but I wonder how it can be used to build n-gram hash on top of it. I want to have something like this or

CPHash cp = new CPHash("efghijk");
cp.shiftRight('l') // now we got hash of "fghijki"
cp.shiftLeft('d') // "defghi"

for java.

For people who will encounter problems related with string search (like me) there are some articles that I found usefull: 1 , 2 , 3

I recently published an Apache licensed Java library which implements several rolling hash functions including Cyclic and Rabin-Karp:

http://code.google.com/p/rollinghashjava/

https://github.com/lemire/rollinghashjava

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM