
Is there a Java structure that allows .contains lookups for serializable objects without actually storing them?

I'm looking for a structure that supports contains() lookups without storing the original values, in order to reduce storage overhead.

The use case is filtering of events in a large stream. I can't possibly store all encountered values, but knowing that certain events occurred before is valuable.

Java Sets or HashMaps store the keys, thus producing way too much overhead to be a viable solution for huge volumes of data.

Storing the actual values is not essential to allow for such lookups. One example is a trie, which can match a multitude of different strings but, because common prefixes are shared, can require significantly less storage than the individual strings combined.

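If what you're after is a guarantee that a value has not yet been seen, a Bloom filter may suit your needs. It never reports a previously added value as absent, but it can occasionally report an unseen value as present (a false positive). A "not contained" answer is therefore definitive, while a "contained" answer is only probable.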

Guava has an implementation, in that case:

https://github.com/google/guava/wiki/HashingExplained#bloomfilter
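To illustrate why such a filter does not need to keep the original values, here is a minimal sketch that uses only the JDK. It is not Guava's implementation: the class name `SimpleBloomFilter`, the parameters `numBits` and `numHashes`, and the choice of hash functions (`Arrays.hashCode` combined with FNV-1a) are assumptions made for this example. Values are passed as byte arrays, so a serializable object would first have to be turned into bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

// Illustrative sketch only; names and hash choices are assumptions, not Guava's API.
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilter(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Records the value by setting numHashes bits; the value itself is not stored.
    void add(byte[] value) {
        int h1 = Arrays.hashCode(value);
        int h2 = fnv1a(value);
        for (int i = 0; i < numHashes; i++) {
            bits.set(Math.floorMod(h1 + i * h2, numBits));
        }
    }

    // false: the value was definitely never added.
    // true: the value was probably added (false positives are possible).
    boolean mightContain(byte[] value) {
        int h1 = Arrays.hashCode(value);
        int h2 = fnv1a(value);
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(Math.floorMod(h1 + i * h2, numBits))) {
                return false;
            }
        }
        return true;
    }

    // 32-bit FNV-1a, used here as a second, independent-ish hash.
    private static int fnv1a(byte[] data) {
        int hash = 0x811C9DC5;
        for (byte b : data) {
            hash ^= (b & 0xFF);
            hash *= 0x01000193;
        }
        return hash;
    }

    public static void main(String[] args) {
        SimpleBloomFilter seen = new SimpleBloomFilter(1 << 20, 5);
        byte[] event = "event-1".getBytes(StandardCharsets.UTF_8);
        seen.add(event);
        System.out.println(seen.mightContain(event)); // prints "true"
    }
}
```

The memory footprint is fixed at `numBits` bits regardless of how many events pass through, but the false-positive rate rises as more values are added. Guava's `BloomFilter` exposes the same two operations as `put` and `mightContain`, and sizes the bit array from an expected number of insertions and a target false-positive probability.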
