Google Protobuf ByteString vs. byte[]

I am working with Google protobuf in Java. I see that it is possible to serialize a protobuf message to String, byte[], ByteString, etc. (source: https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/MessageLite ).

I don't know what a ByteString is. I got the following definition from the protobuf API documentation (source: https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/ByteString ): "Immutable sequence of bytes. Substring is supported by sharing the reference to the immutable underlying bytes, as with String."

It is not clear to me how a ByteString is different from a String or a byte[]. Can somebody please explain? Thanks.

You can think of ByteString as an immutable byte array. That's pretty much it. It's a byte[] which you can use in a protobuf. Protobuf does not let you use Java arrays because they're mutable, and protobuf messages are meant to be immutable once built.
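To make the immutability point concrete, here is a minimal sketch. It assumes the protobuf-java library is on the classpath; the class and method names are made up for illustration. It shows that ByteString.copyFrom() takes a defensive copy of the array, and that toByteArray() hands back a fresh copy, so no caller can change the bytes held by the ByteString.

```java
import com.google.protobuf.ByteString;
import java.util.Arrays;

// Hypothetical demo class; assumes protobuf-java is available.
final class ByteStringImmutabilityDemo {

    /** True if mutating the source array leaves the ByteString unchanged. */
    static boolean survivesSourceMutation() {
        byte[] source = {1, 2, 3};
        ByteString bytes = ByteString.copyFrom(source); // copies the array contents
        source[0] = 99;                                 // mutate the original array
        return bytes.byteAt(0) == 1;
    }

    /** True if mutating an exported array leaves the ByteString unchanged. */
    static boolean survivesExportMutation() {
        ByteString bytes = ByteString.copyFrom(new byte[] {1, 2, 3});
        byte[] exported = bytes.toByteArray();          // a fresh copy each time
        exported[0] = 99;                               // only the copy changes
        return Arrays.equals(bytes.toByteArray(), new byte[] {1, 2, 3});
    }

    public static void main(String[] args) {
        System.out.println(survivesSourceMutation());
        System.out.println(survivesExportMutation());
    }
}
```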

ByteString exists because String is not suitable for representing arbitrary sequences of bytes. String is specifically for character data.

The protobuf MessageLite interface provides toByteArray() and toByteString() methods. If ByteString is an immutable byte[], would the byte representation of a message be the same whether it is serialized to a ByteString or to a byte[]?

Sort of. If you call toByteArray() you'll get the same value as if you were to call toByteString().toByteArray(). Compare the implementation of the two methods in AbstractMessageLite:

public ByteString toByteString() {
  try {
    final ByteString.CodedBuilder out =
      ByteString.newCodedBuilder(getSerializedSize());
    writeTo(out.getCodedOutput());
    return out.build();
  } catch (IOException e) {
    throw new RuntimeException(
      "Serializing to a ByteString threw an IOException (should " +
      "never happen).", e);
  }
}

public byte[] toByteArray() {
  try {
    final byte[] result = new byte[getSerializedSize()];
    final CodedOutputStream output = CodedOutputStream.newInstance(result);
    writeTo(output);
    output.checkNoSpaceLeft();
    return result;
  } catch (IOException e) {
    throw new RuntimeException(
      "Serializing to a byte array threw an IOException " +
      "(should never happen).", e);
  }
}
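A quick way to check this yourself is sketched below. It assumes protobuf-java is on the classpath and uses Int32Value, one of the well-known wrapper messages shipped with the library, only so that no custom .proto file is needed; the demo class name is made up.

```java
import com.google.protobuf.ByteString;
import com.google.protobuf.Int32Value;
import java.util.Arrays;

// Hypothetical demo class; assumes protobuf-java is available.
final class SerializationEquivalenceDemo {

    /** True if both serialization paths yield the same bytes for the message. */
    static boolean sameBytes(int value) {
        Int32Value message = Int32Value.newBuilder().setValue(value).build();

        byte[] direct = message.toByteArray();            // mutable array
        ByteString viaByteString = message.toByteString(); // immutable wrapper

        // Compare in both directions: array contents and ByteString contents.
        return Arrays.equals(direct, viaByteString.toByteArray())
                && viaByteString.equals(ByteString.copyFrom(direct));
    }

    public static void main(String[] args) {
        System.out.println(sameBytes(42));
    }
}
```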

A ByteString gives you the ability to perform more operations on the underlying data without having to copy the data into a new structure. For instance, if you wanted to provide a subset of the bytes in a byte[] to another method, you would need to supply it with a start index and an end index. You can also concatenate ByteStrings without having to create a new data structure and manually copy the data.

However, with a ByteString you can give the method a subset of that data without the method knowing anything about the underlying storage, just like a substring of a normal String.
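The difference in calling style might look like the sketch below. It assumes protobuf-java is on the classpath; the two sum() helpers and the class name are hypothetical consumers invented for the example. Note that substring() takes an inclusive begin index and an exclusive end index, as String.substring() does.

```java
import com.google.protobuf.ByteString;

// Hypothetical demo class; assumes protobuf-java is available.
final class ByteStringSliceDemo {

    /** Consumer that takes a ByteString: no offsets, no knowledge of storage. */
    static int sum(ByteString bytes) {
        int total = 0;
        for (int i = 0; i < bytes.size(); i++) {
            total += bytes.byteAt(i);
        }
        return total;
    }

    /** The same consumer for a raw array needs explicit bounds. */
    static int sum(byte[] bytes, int start, int end) {
        int total = 0;
        for (int i = start; i < end; i++) {
            total += bytes[i];
        }
        return total;
    }

    /** Bytes at positions 1..3 of {10, 20, 30, 40, 50}, i.e. 20, 30, 40. */
    static ByteString middleSlice() {
        ByteString all = ByteString.copyFrom(new byte[] {10, 20, 30, 40, 50});
        return all.substring(1, 4);
    }

    /** The slice above followed by two extra bytes. */
    static ByteString sliceWithSuffix() {
        return middleSlice().concat(ByteString.copyFrom(new byte[] {1, 2}));
    }

    public static void main(String[] args) {
        System.out.println(sum(middleSlice()));
        System.out.println(sum(new byte[] {10, 20, 30, 40, 50}, 1, 4));
        System.out.println(sliceWithSuffix().size());
    }
}
```

The library decides internally how a slice or a concatenation is represented; the caller only sees another ByteString.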

A String is for representing text and is not a good way to store binary data (not all binary data has a textual equivalent unless you encode it in a form that does, e.g. hex or Base64).
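As a small illustration using only the JDK (the class and method names are made up), the sketch below pushes arbitrary bytes through a String and back. Decoding as UTF-8 replaces invalid sequences, so the original bytes are lost, while a Base64 round trip preserves them.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo class; uses only the standard library.
final class BinaryAsTextDemo {

    /** True if the bytes come back unchanged after being treated as UTF-8 text. */
    static boolean survivesUtf8RoundTrip(byte[] data) {
        String text = new String(data, StandardCharsets.UTF_8); // malformed input is replaced
        return Arrays.equals(data, text.getBytes(StandardCharsets.UTF_8));
    }

    /** True if the bytes come back unchanged after a Base64 encode/decode. */
    static boolean survivesBase64RoundTrip(byte[] data) {
        String text = Base64.getEncoder().encodeToString(data);
        return Arrays.equals(data, Base64.getDecoder().decode(text));
    }

    public static void main(String[] args) {
        byte[] binary = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x41}; // not valid UTF-8
        System.out.println(survivesUtf8RoundTrip(binary));
        System.out.println(survivesBase64RoundTrip(binary));
    }
}
```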
