Java BitSet wrong conversion from/to byte array
Working with BitSets, I have a failing test:
BitSet bitSet = new BitSet();
bitSet.set(1);
bitSet.set(100);
logger.info("BitSet: " + BitSetHelper.toString(bitSet));
BitSet fromByteArray = BitSetHelper.fromByteArray(bitSet.toByteArray());
logger.info("fromByteArray: " + BitSetHelper.toString(bitSet));
Assert.assertEquals(2, fromByteArray.cardinality());
Assert.assertTrue(fromByteArray.get(1));   // <-- assertion fails
Assert.assertTrue(fromByteArray.get(100)); // <-- assertion fails
Even stranger, the String representations of both BitSets look identical:
17:34:39.194 [main] INFO c.i.uniques.helper.BitSetHelperTest - BitSet: 00000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000
17:34:39.220 [main] INFO c.i.uniques.helper.BitSetHelperTest - fromByteArray: 00000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000
They are equal! What's happening here?
The methods used in this example are:
public static BitSet fromByteArray(byte[] bytes) {
    BitSet bits = new BitSet();
    for (int i = 0; i < bytes.length * 8; i++) {
        if ((bytes[bytes.length - i / 8 - 1] & (1 << (i % 8))) > 0) {
            bits.set(i);
        }
    }
    return bits;
}
And the method used to get the String representation:
public static String toString(BitSet bitSet) {
    StringBuffer buffer = new StringBuffer();
    for (byte b : bitSet.toByteArray()) {
        buffer.append(String.format("%8s", Integer.toBinaryString(b & 0xFF)).replace(' ', '0'));
    }
    return buffer.toString();
}
Could someone explain what's going on here?
Note that BitSet has a valueOf(byte[]) method that already does this for you.
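As a minimal sketch of that suggestion, the round trip from the question can be done with JDK methods alone. The class name ValueOfRoundTrip and the helper roundTrip are made up for illustration; BitSet.valueOf(byte[]) and toByteArray() are the standard methods (available since Java 7).

```java
import java.util.BitSet;

public class ValueOfRoundTrip {
    // Hypothetical helper: serializes a BitSet to bytes and rebuilds it with the JDK factory.
    static BitSet roundTrip(BitSet original) {
        return BitSet.valueOf(original.toByteArray());
    }

    public static void main(String[] args) {
        BitSet bitSet = new BitSet();
        bitSet.set(1);
        bitSet.set(100);

        BitSet copy = roundTrip(bitSet);
        System.out.println(copy);                 // {1, 100}
        System.out.println(copy.equals(bitSet));  // true
    }
}
```

If the hand-written helper is not needed for other reasons, replacing it with valueOf is probably the simplest fix.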
Inside your fromByteArray method
for (int i = 0; i < bytes.length * 8; i++) {
    if ((bytes[bytes.length - i / 8 - 1] & (1 << (i % 8))) > 0) {
        bits.set(i);
    }
}
you're traversing your byte[] in reverse. On the first iteration, bytes.length - i / 8 - 1 evaluates to 13 - (0 / 8) - 1, which is 12 (a BitSet whose highest set bit is 100 serializes to 13 bytes), so it accesses the last, most significant byte. This is the byte containing bit 100 of your original BitSet, stored as bit 4 within that byte. Because you read that byte first, the bit lands at index 4 of the result, while bit 1 from the first byte lands at index 97. If you check the bits set in your generated BitSet, you'll find exactly those two: the 5th and 98th bits, counting from one.
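To see the effect concretely, the sketch below copies the question's loop unchanged and prints which indices end up set. The class name ReversedReadDemo and the method name fromByteArrayReversed are made up for illustration.

```java
import java.util.BitSet;

public class ReversedReadDemo {
    // The loop from the question, unchanged: it indexes the array from the end.
    static BitSet fromByteArrayReversed(byte[] bytes) {
        BitSet bits = new BitSet();
        for (int i = 0; i < bytes.length * 8; i++) {
            if ((bytes[bytes.length - i / 8 - 1] & (1 << (i % 8))) > 0) {
                bits.set(i);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet bitSet = new BitSet();
        bitSet.set(1);
        bitSet.set(100);

        byte[] bytes = bitSet.toByteArray();
        System.out.println(bytes.length);                  // 13
        System.out.println(fromByteArrayReversed(bytes));  // {4, 97}
    }
}
```

The cardinality is still 2, which is why only the get(1) and get(100) assertions fail. It also appears that the second log statement in the test passes bitSet rather than fromByteArray to BitSetHelper.toString, which would explain why the two logged strings are identical.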
But the byte[] returned by toByteArray() contains
a little-endian representation of all the bits in this bit set
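In practice, little-endian here means bit n lives in byte n / 8, at position n % 8 within that byte. The following sketch inspects the serialized form of the BitSet from the question; the class name LittleEndianLayout and the helper isSet are made up for illustration.

```java
import java.util.BitSet;

public class LittleEndianLayout {
    // Hypothetical helper: tests bit n directly in the serialized form
    // (byte n / 8, bit n % 8 within that byte).
    static boolean isSet(byte[] bytes, int n) {
        return n / 8 < bytes.length && (bytes[n / 8] & (1 << (n % 8))) != 0;
    }

    public static void main(String[] args) {
        BitSet bitSet = new BitSet();
        bitSet.set(1);
        bitSet.set(100);

        byte[] bytes = bitSet.toByteArray();
        System.out.println(bytes.length);        // 13
        System.out.println(bytes[0]);            // 2  (bit 1 of byte 0)
        System.out.println(bytes[12]);           // 16 (bit 4 of byte 12, since 12 * 8 + 4 = 100)
        System.out.println(isSet(bytes, 100));   // true
    }
}
```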
You need to read the byte[] in the appropriate order:
for (int i = 0; i < bytes.length * 8; i++) {
    if ((bytes[i / 8] & (1 << (i % 8))) > 0) {
        bits.set(i);
    }
}
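Put together, a corrected helper might look like the sketch below. The class name CorrectedFromByteArray is made up for illustration, and the comparison uses != 0 instead of > 0, which is a stylistic variation rather than part of the original fix.

```java
import java.util.BitSet;

public class CorrectedFromByteArray {
    // Reads byte i / 8 for bit i, matching the little-endian layout of toByteArray().
    static BitSet fromByteArray(byte[] bytes) {
        BitSet bits = new BitSet();
        for (int i = 0; i < bytes.length * 8; i++) {
            if ((bytes[i / 8] & (1 << (i % 8))) != 0) {
                bits.set(i);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet bitSet = new BitSet();
        bitSet.set(1);
        bitSet.set(100);

        BitSet restored = fromByteArray(bitSet.toByteArray());
        System.out.println(restored);                 // {1, 100}
        System.out.println(restored.equals(bitSet));  // true
    }
}
```

With this version, the three assertions from the test in the question should all pass.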