
Object to bytes array in Java

Original post: 2021-11-22 16:25:50 | Tags: java, serialization, tcp, bytebuffer

I'm working on a proprietary TCP protocol. This protocol sends and receives messages with a specific sequence of bytes. I'm working on a Java controller that uses this protocol, and I thought I could define the message structures in specific classes and serialize/deserialize them, but I was naive.

First of all I tried ObjectOutputStream, but it outputs the entire structure of the object, when I need only the values in a specific order.

Someone already faced this problem: Java - Object to Fixed Byte Array
and solved it with a dedicated Marshaller.

But I was searching for a more flexible solution.

For text serialization and deserialization I've found:
http://jeyben.github.io/fixedformat4j/
which uses annotations to define the schema of the line. But it outputs a String, not a byte[]. So 1 is output as "1", which is represented differently depending on the encoding, and often with more bytes.

What I was searching for is something that, given the order of my class properties, will convert each property into a bunch of bytes (based on the internal representation) and append them to a byte[].

Do you know of a library used for that purpose?
Or a simple way to do that, without coding a serialization algorithm for each of my entities?

1 Answer

Serialization just isn't easy; it sounds from your question like you feel you can just invoke something and out rolls compact, simple, versionable, universal data you can then put on the wire. What you need to fix is to scratch the word 'just' from that sentence. You're going to have to invest some time and care.

As you figured out already, Java's baked-in serialization has a ton of downsides. Don't use that.
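To make one of those downsides concrete, here is a minimal sketch comparing what ObjectOutputStream writes with the raw field values written in a fixed order through DataOutputStream. The Ping class and its two fields are hypothetical, invented only for this illustration; they are not part of the protocol in the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializationSizeDemo {
    // Hypothetical message type, made up for this example.
    static class Ping implements Serializable {
        private static final long serialVersionUID = 1L;
        final int sequence;
        final short flags;

        Ping(int sequence, short flags) {
            this.sequence = sequence;
            this.flags = flags;
        }
    }

    // Built-in serialization: writes a stream header and a class descriptor
    // in addition to the field values.
    static byte[] viaObjectOutputStream(Ping p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Hand-written layout: only the values, big-endian, in a fixed order.
    static byte[] viaDataOutputStream(Ping p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(p.sequence);  // 4 bytes
            out.writeShort(p.flags);   // 2 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        Ping p = new Ping(42, (short) 1);
        System.out.println("ObjectOutputStream: " + viaObjectOutputStream(p).length + " bytes");
        // 6 bytes: 4 for the int plus 2 for the short
        System.out.println("DataOutputStream:   " + viaDataOutputStream(p).length + " bytes");
    }
}
```

The second method is exactly the per-entity code the question hoped to avoid; it is shown here only to make the size and layout difference visible.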

There are various serializers. The popular ones are things like GSON or Jackson, which let you serialize Java objects into JSON. This isn't particularly efficient, and is string based. These sound like crucial downsides but they really aren't, see below.

You can also spend a little more time specifying the exact format and use protobuf, which lets you write a quite lean and simple data protocol (and protobuf is available for many languages, if you eventually want to write a participant in this protocol in a non-Java language).

So, those are the good options: go to JSON via Jackson or GSON, or use protobuf.

But JSON is a string.

You can turn a string to bytes trivially using str.getBytes(StandardCharsets.UTF_8). This cannot fail due to charset encoding differences, as long as you also 'decode' in the same fashion: turn the bytes into a string with new String(theBytes, StandardCharsets.UTF_8). UTF-8 is guaranteed to be available on all JVMs; if it is not there, your JVM is as broken as a JVM that is missing the String class - not something to worry about.
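As a small sketch of that round trip, using only the two standard calls named above (the class name, method names, and sample JSON text are made up for illustration):

```java
import java.nio.charset.StandardCharsets;

class Utf8RoundTrip {
    // String -> bytes for the wire, always with an explicit charset.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Bytes from the wire -> String, using the same charset.
    static String decode(byte[] wire) {
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample payload; \u00e9 is a non-ASCII character (e with acute accent).
        String json = "{\"id\":1,\"name\":\"caf\u00e9\"}";
        byte[] wire = encode(json);
        String back = decode(wire);
        System.out.println(wire.length + " bytes on the wire");
        System.out.println("round trip intact: " + json.equals(back));
    }
}
```

Passing the charset explicitly on both sides is the point: the no-argument getBytes() and new String(bytes) fall back to the platform default charset, which can differ between machines.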

But JSON is inefficient.

Zip it up, of course. You can trivially wrap an InputStream and an OutputStream so that gzip compression is applied, which is simple, available on just about every platform, and fast (it's not the most efficient cutting-edge compression algorithm, but squeezing the last few bytes out is usually not worth it) - and zipped-up JSON can often be more efficient than carefully hand-rolled protobuf, even.
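A minimal sketch of that wrapping, using java.util.zip from the standard library. It compresses into an in-memory buffer so it can run standalone; on a real connection you would presumably wrap the socket's streams instead. The class and method names are hypothetical, and readAllBytes() assumes Java 9 or newer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipJsonDemo {
    // Wrap the sink in a GZIPOutputStream; closing it finishes the gzip stream.
    static byte[] compress(String json) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink)) {
            gzip.write(json.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Wrap the source in a GZIPInputStream and decode the result as UTF-8.
    static String decompress(byte[] wire) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(wire))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Build a repetitive JSON array; the field names are made up.
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < 200; i++) {
            if (i > 0) sb.append(',');
            sb.append("{\"sensor\":\"temp\",\"value\":").append(i).append('}');
        }
        String json = sb.append(']').toString();

        byte[] wire = compress(json);
        System.out.println("raw bytes:     " + json.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("gzipped bytes: " + wire.length);
        System.out.println("round trip intact: " + json.equals(decompress(wire)));
    }
}
```

One caveat worth knowing: gzip adds a fixed header and trailer, so very small messages can come out larger than the uncompressed input. The gain shows up on larger or repetitive payloads.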

The one downside is that it's 'slow', but on modern hardware, note that the overhead of encrypting and decrypting this data (which you should obviously be doing!) is usually multiple orders of magnitude more involved. A modern CPU is simply very, very fast - creating JSON and zipping it up is going to take 1% of CPU or less even if you are shipping the collected works of Shakespeare every second.

If an Arduino running on batteries needs to process this data, go with uncompressed, unencrypted protobuf-based data. If you are Facebook and writing the WhatsApp protocol, the IaaS credits saved by not having to unzip and decode JSON are tiny and pale in comparison to the credits you spend just running the servers, but at that scale it's worth the development effort.

In just about every other case, just toss gzipped JSON on the line.