Guava BloomFilter: is it possible to get a true negative for a value that was **previously** a false positive?

If I understand correctly, once an item is put inside a Guava Bloom filter, mightContain will always return true for it. If mightContain returns false, then the value has never been put inside the filter. What I'm wondering about is the values that are false positives at a given moment: as more values are put in, could those one-time false positives become true negatives later on (provided they are never put in themselves, of course)?

Something like this:

import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

// 1000 expected insertions is an arbitrary stand-in for the original "blah, blah" arguments
BloomFilter<Integer> bf = BloomFilter.create(Funnels.integerFunnel(), 1000);
// if I start checking, none of the values should return true at the moment
System.out.println(bf.mightContain(5)); // false
System.out.println(bf.mightContain(10)); // false
System.out.println(bf.mightContain(15)); // false
// fine
// let's put in a value now
bf.put(10);
System.out.println(bf.mightContain(5));
System.out.println(bf.mightContain(10)); // true, every time from now on
System.out.println(bf.mightContain(15));

In the last three checks, checking for 10 will always return true. For 5 and 15, it might return true. Suppose that for 5 we get false (it was never put in), and for 15 we get a false positive.

So, we continue:

bf.put(5);
System.out.println(bf.mightContain(5)); // true, every single time from now on
System.out.println(bf.mightContain(10)); // true, every time from now on
System.out.println(bf.mightContain(15));

So now, when checking for 5, we will always get true. Is it possible that, because of the state change inside the Bloom filter, the check for 15, which was previously a false positive, might now return a true negative?

For a true Bloom filter, the bits only ever go from 0 to 1, never back. So the result of a mightContain call can only ever go from false to true, never back, because mightContain returns true if a certain subset of all the bits are 1, and once they're 1 they'll stay 1.
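The monotonicity argument can be checked on a toy filter. The sketch below is not Guava's code: the class ToyBloomFilter, its hashing scheme, and the sizes (64 bits, 3 probes) are assumptions chosen only to make false positives frequent. It inserts values one at a time and verifies that no never-inserted probe value goes from true back to false.

```java
import java.util.BitSet;

/** Toy Bloom filter for int keys; illustrative only, not Guava's implementation. */
class ToyBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    ToyBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // i-th probe position for a value (simple double hashing, chosen for brevity).
    private int index(int value, int i) {
        int h1 = Integer.hashCode(value) * 0x9E3779B9;
        int h2 = Integer.reverse(h1) | 1;
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void put(int value) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(value, i)); // only ever turns a bit on
        }
    }

    boolean mightContain(int value) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(value, i))) {
                return false; // at least one probed bit is still 0
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ToyBloomFilter bf = new ToyBloomFilter(64, 3);
        boolean[] wasPositive = new boolean[200];
        for (int inserted = 0; inserted < 100; inserted++) {
            bf.put(inserted);
            // Probe values 1000..1199, none of which is ever inserted.
            for (int p = 0; p < wasPositive.length; p++) {
                boolean now = bf.mightContain(1000 + p);
                if (wasPositive[p] && !now) {
                    throw new AssertionError("a positive reverted to a negative");
                }
                wasPositive[p] = now;
            }
        }
        System.out.println("No probe ever went from true back to false.");
    }
}
```

Because put only sets bits, any probe whose bits are all 1 keeps them all 1, so the assertion cannot fire regardless of the hash functions used.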

Guava's implementation is indeed a true Bloom filter: the BloomFilter.put method delegates to Strategy.put, a method of an interface implemented in BloomFilterStrategies. The Bloom filter's bits are stored in a LockFreeBitArray named bits, and the strategy only calls its bitSize, set and get methods. Of those, only set changes bits, and it only uses the bitwise 'or' operator | to change them. This can never change a 1 back to a 0.
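To make the "only bitwise or" point concrete, here is a minimal sketch of a bit array that exposes the same three operations named above (bitSize, set, get). It is loosely modelled on that description and is not copied from Guava's source; the class name OrOnlyBitArray and the method bodies are assumptions for illustration.

```java
import java.util.concurrent.atomic.AtomicLongArray;

/** Sketch of an OR-only bit array; names are illustrative, not Guava's source. */
class OrOnlyBitArray {
    private final AtomicLongArray data;

    OrOnlyBitArray(long numBits) {
        // Round up to a whole number of 64-bit words.
        this.data = new AtomicLongArray((int) ((numBits + 63) / 64));
    }

    long bitSize() {
        return (long) data.length() * 64;
    }

    /** Sets the bit and returns true if it changed from 0 to 1. */
    boolean set(long bitIndex) {
        int word = (int) (bitIndex >>> 6);
        long mask = 1L << bitIndex; // a long shift uses only the low 6 bits of the count
        long oldValue;
        long newValue;
        do {
            oldValue = data.get(word);
            newValue = oldValue | mask; // OR can add 1 bits but never remove them
            if (oldValue == newValue) {
                return false; // bit was already set
            }
        } while (!data.compareAndSet(word, oldValue, newValue));
        return true;
    }

    boolean get(long bitIndex) {
        return (data.get((int) (bitIndex >>> 6)) & (1L << bitIndex)) != 0;
    }

    public static void main(String[] args) {
        OrOnlyBitArray bits = new OrOnlyBitArray(128);
        System.out.println(bits.set(70)); // true: bit flipped from 0 to 1
        System.out.println(bits.set(70)); // false: already 1, nothing changes
        System.out.println(bits.get(70)); // true
    }
}
```

Since the only mutation is `oldValue | mask`, every word's set of 1 bits can only grow, which is exactly the property the answer relies on.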

So it is indeed impossible for a value that was previously a false positive to later become a true negative.
