
Java BitSet wrong conversion from/to byte array

Working with BitSets, I have a failing test:

BitSet bitSet = new BitSet();
bitSet.set(1);
bitSet.set(100);
logger.info("BitSet: " + BitSetHelper.toString(bitSet));
BitSet fromByteArray = BitSetHelper.fromByteArray(bitSet.toByteArray());
logger.info("fromByteArray: " + BitSetHelper.toString(bitSet));
Assert.assertEquals(2, fromByteArray.cardinality());
Assert.assertTrue(fromByteArray.get(1));   // <-- assertion fails
Assert.assertTrue(fromByteArray.get(100)); // <-- assertion fails

Even stranger, here are the String representations of both BitSets:

17:34:39.194 [main] INFO  c.i.uniques.helper.BitSetHelperTest - BitSet: 00000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000
17:34:39.220 [main] INFO  c.i.uniques.helper.BitSetHelperTest - fromByteArray: 00000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000

They are equal! What's happening here?

The methods used in this example are:

public static BitSet fromByteArray(byte[] bytes) {
    BitSet bits = new BitSet();
    for (int i = 0; i < bytes.length * 8; i++) {
        if ((bytes[bytes.length - i / 8 - 1] & (1 << (i % 8))) > 0) {
            bits.set(i);
        }
    }
    return bits;
}

And the method used to get the String representation:

public static String toString(BitSet bitSet) {
    StringBuffer buffer = new StringBuffer();
    for (byte b : bitSet.toByteArray()) {
        buffer.append(String.format("%8s", Integer.toBinaryString(b & 0xFF)).replace(' ', '0'));
    }
    return buffer.toString();
}

Could someone explain what's going on here?

Note that BitSet has a valueOf(byte[]) method that already does this for you.
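As a minimal sketch of that suggestion, the round trip from the question could be written with the built-in factory alone. The class name ValueOfRoundTrip and the helper roundTrip are made up for illustration; only BitSet.toByteArray() and BitSet.valueOf(byte[]) come from the JDK.

```java
import java.util.BitSet;

// Hypothetical demo class, not part of the original question.
class ValueOfRoundTrip {
    // Serialize a BitSet to bytes and rebuild it with the JDK factory method.
    static BitSet roundTrip(BitSet original) {
        byte[] bytes = original.toByteArray(); // little-endian byte order
        return BitSet.valueOf(bytes);          // reads the same byte order back
    }

    public static void main(String[] args) {
        BitSet bitSet = new BitSet();
        bitSet.set(1);
        bitSet.set(100);

        BitSet restored = roundTrip(bitSet);
        System.out.println(restored);                // prints {1, 100}
        System.out.println(restored.equals(bitSet)); // prints true
    }
}
```

Since both methods agree on the byte order, no custom index arithmetic is needed.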

Inside your fromByteArray method

for (int i = 0; i < bytes.length * 8; i++) {
    if ((bytes[bytes.length - i / 8 - 1] & (1 << (i % 8))) > 0) {
        bits.set(i);
    }
}

you're traversing your byte[] in reverse. On the first iteration,

bytes.length - i / 8 - 1

will evaluate to

13 - (0 / 8) - 1

which is 12 (toByteArray() returns 13 bytes here, because the highest set bit is 100), so it accesses the last, most significant byte. This is the byte containing bit 100 of your original bitset, stored as bit 4 of that byte (100 % 8 = 4). Because your loop maps that byte to bits 0 through 7 of the result, bit 4 gets set instead of bit 100. In the same way, bit 1 of the original ends up as bit 97. So if you check the bits set in your generated BitSet, you'll find bits 4 and 97 (the 5th and 98th, counting from one).
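To make the effect of the reversed traversal concrete, here is a small sketch that runs the question's loop on the same input and prints which bits come out. The class and method names (ReversedTraversalDemo, fromByteArrayReversed) are hypothetical; the loop body is copied from the question.

```java
import java.util.BitSet;

// Hypothetical demo class reproducing the question's reversed loop.
class ReversedTraversalDemo {
    // Same logic as the question's fromByteArray: reads the byte array back to front.
    static BitSet fromByteArrayReversed(byte[] bytes) {
        BitSet bits = new BitSet();
        for (int i = 0; i < bytes.length * 8; i++) {
            if ((bytes[bytes.length - i / 8 - 1] & (1 << (i % 8))) > 0) {
                bits.set(i);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet bitSet = new BitSet();
        bitSet.set(1);
        bitSet.set(100);

        byte[] bytes = bitSet.toByteArray();
        System.out.println(bytes.length);                 // prints 13
        System.out.println(fromByteArrayReversed(bytes)); // prints {4, 97}
    }
}
```

The cardinality is still 2, which is why only the get(1) and get(100) assertions fail. Note also that the second log statement in the question passes bitSet, not fromByteArray, to BitSetHelper.toString, which would explain why the two logged strings are identical.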

But the byte[] returned by toByteArray() contains

a little-endian representation of all the bits in this bit set
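In practice, little-endian here means that bit n lives in bytes[n / 8], at position n % 8 within that byte. The following sketch checks this for the two bits from the question; the class ByteLayoutDemo and its helpers byteIndex and bitMask are illustrative names, not JDK API.

```java
import java.util.BitSet;

// Hypothetical demo class showing where each bit lands in toByteArray().
class ByteLayoutDemo {
    // Index of the byte that holds the given bit.
    static int byteIndex(int bit) {
        return bit / 8;
    }

    // Mask selecting the given bit inside its byte.
    static int bitMask(int bit) {
        return 1 << (bit % 8);
    }

    public static void main(String[] args) {
        BitSet bitSet = new BitSet();
        bitSet.set(1);
        bitSet.set(100);

        byte[] bytes = bitSet.toByteArray();
        System.out.println(bytes.length); // prints 13

        for (int bit : new int[] {1, 100}) {
            boolean set = (bytes[byteIndex(bit)] & bitMask(bit)) != 0;
            System.out.println(String.format("bit %d -> bytes[%d], mask 0x%02x, set=%b",
                    bit, byteIndex(bit), bitMask(bit), set));
        }
    }
}
```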

You need to read the byte[] in the appropriate order

for (int i = 0; i < bytes.length * 8; i++) {
    if ((bytes[i / 8] & (1 << (i % 8))) > 0) {
        bits.set(i);
    }
}
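Put together, the corrected method could look like the sketch below, which also compares its result against BitSet.valueOf on the same bytes. The class name FixedBitSetHelper is made up for this example, and the comparison uses != 0 instead of > 0 purely as a stylistic choice; both give the same result here.

```java
import java.util.BitSet;

// Hypothetical helper class holding the corrected conversion.
class FixedBitSetHelper {
    // Rebuilds a BitSet from the little-endian bytes produced by toByteArray().
    static BitSet fromByteArray(byte[] bytes) {
        BitSet bits = new BitSet(bytes.length * 8);
        for (int i = 0; i < bytes.length * 8; i++) {
            // Bit i is stored in byte i / 8, at position i % 8.
            if ((bytes[i / 8] & (1 << (i % 8))) != 0) {
                bits.set(i);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet bitSet = new BitSet();
        bitSet.set(1);
        bitSet.set(100);

        byte[] bytes = bitSet.toByteArray();
        BitSet restored = fromByteArray(bytes);
        System.out.println(restored);                               // prints {1, 100}
        System.out.println(restored.equals(BitSet.valueOf(bytes))); // prints true
    }
}
```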
