Java fastest way to get cardinality of BitSet intersection

The function below takes two BitSets, makes a copy of the first (the original must not be overwritten), intersects the copy with the second (bitwise AND), and returns the cardinality of the result.

public int getIntersectionSize(BitSet bits1, BitSet bits2) {
    BitSet copy = (BitSet) bits1.clone();
    copy.and(bits2);
    return copy.cardinality();
}

Can this code be sped up? The function is called billions of times, so even a microsecond saved per call matters; I'm also curious what the fastest possible code looks like.

If you're going to use each BitSet several times, it could be worthwhile to create a long array corresponding to each BitSet. For each BitSet:

long[] longs = bitset.toLongArray();

Then you can use the following method, which avoids the overhead of creating a cloned BitSet. (This assumes that both arrays are the same length.)

int getIntersectionSize(long[] bits1, long[] bits2) {
    int nBits = 0;
    for (int i=0; i<bits1.length; i++)
        nBits += Long.bitCount(bits1[i] & bits2[i]);
    return nBits;
}
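
The same-length assumption matters in practice: BitSet.toLongArray() returns only the words up to the highest set bit, so two BitSets with different highest bits produce arrays of different lengths. A minimal sketch of a variant that tolerates this is shown below; the class name LongArrayIntersection and the method name intersectionSize are made up for illustration and are not part of the original answer. Words beyond the shorter array cannot contribute to an intersection, so the loop only scans the common prefix.

```java
import java.util.BitSet;

// Hypothetical helper class; the names are illustrative, not from the original answer.
final class LongArrayIntersection {

    // Counts the bits set in both arrays. Only the common prefix is scanned,
    // because words missing from the shorter array are implicitly zero.
    static int intersectionSize(long[] bits1, long[] bits2) {
        int n = Math.min(bits1.length, bits2.length);
        int count = 0;
        for (int i = 0; i < n; i++) {
            count += Long.bitCount(bits1[i] & bits2[i]);
        }
        return count;
    }

    public static void main(String[] args) {
        BitSet a = new BitSet();
        a.set(1);
        a.set(5);
        a.set(200);          // highest bit 200, so toLongArray() has 4 words
        BitSet b = new BitSet();
        b.set(5);
        b.set(7);            // highest bit 7, so toLongArray() has 1 word
        // Only bit 5 is set in both.
        System.out.println(intersectionSize(a.toLongArray(), b.toLongArray())); // prints 1
    }
}
```

Note that the arrays are snapshots: if a BitSet changes later, its long array has to be regenerated.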

Here is an alternative version. I'm not sure whether it is really faster; that depends on the cost of nextSetBit.

public int getIntersectionsSize(BitSet bits1, BitSet bits2) {
   int count = 0;
   int i = bits1.nextSetBit(0);
   int j = bits2.nextSetBit(0);
   while (i >= 0 && j >= 0) {
      if (i < j) {
         i = bits1.nextSetBit(i + 1);
      } else if (i > j) {
         j = bits2.nextSetBit(j + 1);
      } else {
         count++;
         i = bits1.nextSetBit(i + 1);
         j = bits2.nextSetBit(j + 1);
      }
   }
   return count;
}

The above is the readable version, hopefully good enough for the compiler, but I guess you could optimize it manually:

public int getIntersectionsSize(BitSet bits1, BitSet bits2) {
   int count = 0;
   for (int i = bits1.nextSetBit(0), j = bits2.nextSetBit(0); i >= 0 && j >= 0; ) {
      while (i < j) {
         i = bits1.nextSetBit(i + 1);
         if (i < 0)
            return count;
      }
      if (i == j) {
         count++;
         i = bits1.nextSetBit(i + 1);
      }
      while (j < i) {
         j = bits2.nextSetBit(j + 1);
         if (j < 0)
            return count;
      }
      if (i == j) {
         count++;
         j = bits2.nextSetBit(j + 1);
      }
   }
   return count;
}

I've been looking for a solution to this recently, and here's what I came up with:

int intersectionCardinality(final BitSet lhs, final BitSet rhs) {
    int lhsNext;
    int retVal = 0;
    int rhsNext = 0;

    while ((lhsNext = lhs.nextSetBit(rhsNext)) != -1 &&
            (rhsNext = rhs.nextSetBit(lhsNext)) != -1) {
        if (rhsNext == lhsNext) {
            retVal++;
            rhsNext++;
        }
    }

    return retVal;
}

Perhaps someone would like to take the time to compare the different solutions here and post the results.
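
As a possible starting point for such a comparison, here is a rough sketch of a harness that first checks that three of the variants agree on random inputs and then times them with System.nanoTime. The class IntersectionCheck, its method names, and the chosen sizes, densities, and repetition counts are all assumptions made for illustration. The timing loop is naive (no proper warm-up or isolation), so its numbers are only a coarse hint; a dedicated benchmarking tool such as JMH would give more trustworthy results.

```java
import java.util.BitSet;
import java.util.Random;

// Hypothetical harness; class and method names are illustrative only.
final class IntersectionCheck {

    // Baseline from the question: clone, AND, count.
    static int cloneAnd(BitSet a, BitSet b) {
        BitSet copy = (BitSet) a.clone();
        copy.and(b);
        return copy.cardinality();
    }

    // Word-wise variant over the common prefix of the two long arrays.
    static int wordWise(long[] a, long[] b) {
        int n = Math.min(a.length, b.length);
        int count = 0;
        for (int i = 0; i < n; i++) {
            count += Long.bitCount(a[i] & b[i]);
        }
        return count;
    }

    // nextSetBit-based variant, following the last answer above.
    static int leapfrog(BitSet lhs, BitSet rhs) {
        int count = 0;
        int lhsNext;
        int rhsNext = 0;
        while ((lhsNext = lhs.nextSetBit(rhsNext)) != -1
                && (rhsNext = rhs.nextSetBit(lhsNext)) != -1) {
            if (rhsNext == lhsNext) {
                count++;
                rhsNext++;
            }
        }
        return count;
    }

    // Builds a BitSet in which each of the first `size` bits is set with probability `density`.
    static BitSet randomBitSet(Random rnd, int size, double density) {
        BitSet bs = new BitSet(size);
        for (int i = 0; i < size; i++) {
            if (rnd.nextDouble() < density) {
                bs.set(i);
            }
        }
        return bs;
    }

    // Returns true if all three variants agree on `trials` random pairs.
    static boolean allAgree(long seed, int trials) {
        Random rnd = new Random(seed);
        for (int t = 0; t < trials; t++) {
            BitSet a = randomBitSet(rnd, 1 + rnd.nextInt(2000), rnd.nextDouble());
            BitSet b = randomBitSet(rnd, 1 + rnd.nextInt(2000), rnd.nextDouble());
            int expected = cloneAnd(a, b);
            if (wordWise(a.toLongArray(), b.toLongArray()) != expected) return false;
            if (leapfrog(a, b) != expected) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("all variants agree: " + allAgree(42L, 1000));

        Random rnd = new Random(1L);
        BitSet a = randomBitSet(rnd, 10_000, 0.5);
        BitSet b = randomBitSet(rnd, 10_000, 0.5);
        long[] la = a.toLongArray();
        long[] lb = b.toLongArray();
        int reps = 1_000;
        int sink = 0; // accumulate results so the calls are not optimized away

        long t0 = System.nanoTime();
        for (int i = 0; i < reps; i++) sink += cloneAnd(a, b);
        long t1 = System.nanoTime();
        for (int i = 0; i < reps; i++) sink += wordWise(la, lb);
        long t2 = System.nanoTime();
        for (int i = 0; i < reps; i++) sink += leapfrog(a, b);
        long t3 = System.nanoTime();

        System.out.printf("cloneAnd %d us, wordWise %d us, leapfrog %d us (sink=%d)%n",
                (t1 - t0) / 1_000, (t2 - t1) / 1_000, (t3 - t2) / 1_000, sink);
    }
}
```

The nextSetBit-based variant visits set bits one at a time, while the other two work a 64-bit word at a time, so the relative ranking may well change with the density of the inputs; it is worth varying the density parameter before drawing conclusions.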
