
Java set of byte arrays

I have a HashSet of byte[]s and I would like to test whether a new byte[] is in that set. The problem is that Java seems to be testing whether the byte[] instances are the same rather than testing whether the actual values in the byte arrays are the same.

In other words, consider the following code:

public class Test
{
    public static void main(String[] args)
    {
        java.util.HashSet<byte[]> set=new java.util.HashSet<byte[]>();
        set.add(new String("abc").getBytes());
        System.out.println(set.contains(new String("abc").getBytes()));
    }
}

This code prints false, and I would like it to print true. How should I go about doing this?

You can wrap each byte array using ByteBuffer.wrap, which will provide the right equals and hashCode behavior for you. Just be careful which methods you call on the ByteBuffer: don't modify the array or advance the buffer's position.
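As a rough illustration of this answer using only the JDK, the sketch below stores wrapped arrays in a HashSet and also shows the pitfall mentioned above. The class name ByteBufferSetDemo and the helper key are made up for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

class ByteBufferSetDemo {
    // Hypothetical helper: wraps the UTF-8 bytes of a string.
    static ByteBuffer key(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        Set<ByteBuffer> set = new HashSet<>();
        set.add(key("abc"));

        // A different array with the same contents is found.
        System.out.println(set.contains(key("abc"))); // true

        // Pitfall: equals and hashCode only consider the remaining bytes,
        // so a probe whose position has advanced no longer matches.
        ByteBuffer probe = key("abc");
        probe.get(); // advances the position past 'a'
        System.out.println(set.contains(probe)); // false
    }
}
```

ByteBuffer compares the bytes between position and limit, which is why a relative get() on a buffer changes what it is equal to.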

You could create a ByteArray class that wraps the byte arrays and tests for equality the way you want. Then you'd have a Set<ByteArray>.
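One possible shape for such a wrapper is sketched below. The answer only names the class, so the fields, the defensive copy in the constructor, and the demo class are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class ByteArray {
    private final byte[] data;

    ByteArray(byte[] data) {
        // Assumption: copy the input so later changes to the caller's
        // array cannot alter this key's hash code.
        this.data = data.clone();
    }

    @Override
    public boolean equals(Object other) {
        // Content-based equality instead of the identity check arrays use.
        return other instanceof ByteArray
                && Arrays.equals(data, ((ByteArray) other).data);
    }

    @Override
    public int hashCode() {
        // Must agree with equals: equal contents give equal hash codes.
        return Arrays.hashCode(data);
    }
}

class ByteArrayDemo {
    public static void main(String[] args) {
        Set<ByteArray> set = new HashSet<>();
        set.add(new ByteArray("abc".getBytes(StandardCharsets.UTF_8)));
        System.out.println(
                set.contains(new ByteArray("abc".getBytes(StandardCharsets.UTF_8)))); // true
    }
}
```

Copying the array costs an allocation per key, but it keeps the key immutable, which is what a HashSet element needs.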

Modern solution (as of right now)

import com.google.common.collect.ImmutableSet;

import java.nio.ByteBuffer;
import java.util.Set;

import static com.google.common.base.Charsets.UTF_8;
import static java.nio.ByteBuffer.wrap;

public class Scratch
{
    public static void main(String[] args)
    {
        final Set<ByteBuffer> bbs = ImmutableSet.of(wrap("abc".getBytes(UTF_8)).asReadOnlyBuffer());
        System.out.println("bbs.contains(ByteBuffer.wrap(\"abc\".getBytes(Charsets.UTF_8))) = " + bbs.contains(wrap("abc".getBytes(UTF_8)).asReadOnlyBuffer()));
    }
}

NOTES:

You should never convert a String to a byte[] without providing a Charset. Otherwise the result depends on the runtime's default Charset, which is usually not a good one and can change.

.asReadOnlyBuffer() is important!

Creates a new, read-only byte buffer that shares this buffer's content. The content of the new buffer will be that of this buffer. Changes to this buffer's content will be visible in the new buffer; the new buffer itself, however, will be read-only and will not allow the shared content to be modified.

The two buffers' position, limit, and mark values will be independent.

The new buffer's capacity, limit, position, and mark values will be identical to those of this buffer. If this buffer is itself read-only then this method behaves in exactly the same way as the duplicate method.
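The quoted behavior can be checked with a small sketch like the one below. The class name ReadOnlyBufferDemo and the helper rejectsWrites are invented for illustration. Note that the read-only view still shares the original array, so code that holds the array itself can change the contents.

```java
import java.nio.ByteBuffer;
import java.nio.ReadOnlyBufferException;

class ReadOnlyBufferDemo {
    // Hypothetical helper: returns true if an absolute put is refused.
    static boolean rejectsWrites(ByteBuffer buffer) {
        try {
            // Rewrites the same value, so a writable buffer is left unchanged.
            buffer.put(0, buffer.get(0));
            return false;
        } catch (ReadOnlyBufferException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] bytes = {1, 2, 3};
        ByteBuffer original = ByteBuffer.wrap(bytes);
        ByteBuffer readOnly = original.asReadOnlyBuffer();

        System.out.println(rejectsWrites(readOnly)); // true
        System.out.println(rejectsWrites(original)); // false

        original.get();                           // advances only the original's position
        System.out.println(readOnly.position());  // 0

        bytes[0] = 42;                            // the backing array is still shared
        System.out.println(readOnly.get(0));      // 42
    }
}
```

If a key's contents must never change after insertion, copying the array before wrapping it is the safer choice.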

You can define your own wrapper class, but the simplest approach is to "wrap" the arrays in ArrayLists and use a HashSet<ArrayList>.
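A minimal sketch of the ArrayList idea follows, assuming the element type is Byte (the answer does not spell it out). The class name ByteListDemo and the helper toList are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ByteListDemo {
    // Hypothetical helper: boxes each byte into a List<Byte>.
    static List<Byte> toList(byte[] bytes) {
        List<Byte> list = new ArrayList<>(bytes.length);
        for (byte b : bytes) {
            list.add(b);
        }
        return list;
    }

    public static void main(String[] args) {
        Set<List<Byte>> set = new HashSet<>();
        set.add(toList("abc".getBytes(StandardCharsets.UTF_8)));

        // List.equals and List.hashCode are defined element by element,
        // so a separately built list with the same bytes is found.
        System.out.println(set.contains(toList("abc".getBytes(StandardCharsets.UTF_8)))); // true
    }
}
```

This needs no custom class, at the cost of boxing every byte, so it is likely to use noticeably more memory than the wrapper approaches for large arrays.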

You can avoid wrappers and the stupid hashCode problem (hey, a standard thing like a byte[] doesn't have a usable hashCode, right?):

Use TreeSet instead of HashSet and provide a byte[] comparator at instantiation time:

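  import java.util.Comparator;
  import java.util.Set;
  import java.util.TreeSet;

  // Orders arrays lexicographically, treating each byte as unsigned (0..255).
  Set<byte[]> byteATreeSet = new TreeSet<byte[]>(new Comparator<byte[]>() {
    @Override
    public int compare(byte[] left, byte[] right) {
      for (int i = 0; i < left.length && i < right.length; i++) {
        int a = left[i] & 0xff;
        int b = right[i] & 0xff;
        if (a != b) {
          return a - b;
        }
      }
      // Equal common prefix: the shorter array sorts first.
      return left.length - right.length;
    }
  });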

If you get a HashSet<byte[]> b from somewhere else, first initialize your variable a as a TreeSet with this comparator and then call a.addAll(b). This way, even if b contained duplicates, a does not.
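The addAll step could be sketched as follows. The class name TreeSetDedupDemo, the constant UNSIGNED_LEXICOGRAPHIC, and the method dedupe are hypothetical. The comparator is written as a lambda but follows the same unsigned, lexicographic ordering as the answer's comparator.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class TreeSetDedupDemo {
    // Compares arrays byte by byte as unsigned values, then by length.
    static final Comparator<byte[]> UNSIGNED_LEXICOGRAPHIC = (left, right) -> {
        int n = Math.min(left.length, right.length);
        for (int i = 0; i < n; i++) {
            int a = left[i] & 0xff;
            int b = right[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return left.length - right.length;
    };

    // Copies the source into a TreeSet that treats equal contents as duplicates.
    static Set<byte[]> dedupe(Collection<byte[]> source) {
        Set<byte[]> result = new TreeSet<>(UNSIGNED_LEXICOGRAPHIC);
        result.addAll(source);
        return result;
    }

    public static void main(String[] args) {
        Set<byte[]> b = new HashSet<>();
        b.add(new byte[] {1, 2, 3});
        b.add(new byte[] {1, 2, 3}); // distinct instance, same contents
        System.out.println(b.size()); // 2

        Set<byte[]> a = dedupe(b);
        System.out.println(a.size()); // 1
        System.out.println(a.contains(new byte[] {1, 2, 3})); // true
    }
}
```

A TreeSet decides membership with the comparator rather than equals and hashCode, so lookups cost O(log n) comparisons instead of the expected constant time of a HashSet.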
