
Is additional base64 encoding necessary when sending files as byte[] from Java service to Java service via RestTemplate?

I am sending data via a JSON body in a POST request from a client (Java) to a server (Java) using a Spring RestTemplate and RestController.

The data is present as a POJO on the client and will be parsed into a POJO with the same structure on the server.

On the client I am converting a file to byte[] with Files.readAllBytes and storing it in the content field.

On the server side the whole object, including the byte[], will be marshalled to XML using JAXB annotations.

class BinaryObject {
  String fileName;
  String mimeCode;
  byte[] content;
}

Everything is working fine and running as intended. I heard it could be beneficial to encode the content field before transmitting the data to the server and decode it there before it is marshalled into XML.

My Question

Is it necessary or recommended to additionally encode / decode the content field with base64?

TL;DR

To the best of my knowledge, you are not going against any good practice with your current implementation. One might question the design (exchanging files in JSON? Storing binary inside XML?), but this is a separate question.

Still, there is room for possible optimization, but the toolset you use (e.g. Spring RestTemplate + Spring controller + JSON serialization (Jackson) + XML using JAXB) kind of hides the possible optimizations from you.

You have to carefully weigh the pros and cons of working around your comfortable "automat(g)ical" serializations, which work well as of today, to see if it is worth the trouble to tweak them.

We can nonetheless discuss the theory of what could be done.

A discussion about Base64

Base64 encoding is an efficient way to encode binary data in pure text formats (e.g. MIME structures such as email or some HTTP bodies, JSON, XML, ...) but it has two costs: the first is a non-negligible size increase (~33%), the second is CPU time.
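To make the size cost concrete, here is a minimal sketch using the JDK's java.util.Base64. The class name Base64Overhead and the 10,000-byte stand-in payload are made up for illustration; the only point is that padded Base64 emits 4 characters for every group of 3 input bytes.

```java
import java.util.Base64;

public class Base64Overhead {

    // Length of padded Base64 text for rawLength input bytes:
    // every (possibly partial) 3-byte group becomes 4 characters.
    static long encodedLength(long rawLength) {
        return 4 * ((rawLength + 2) / 3);
    }

    public static void main(String[] args) {
        byte[] raw = new byte[10_000]; // stand-in for a file's content
        String encoded = Base64.getEncoder().encodeToString(raw);
        System.out.println(raw.length + " raw bytes -> "
                + encoded.length() + " Base64 chars (predicted "
                + encodedLength(raw.length) + ")");
    }
}
```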

Sometimes (but you'd have to profile and check whether that is your case) this cost is not negligible, especially for large files: due to some buffering and char/byte conversions in the frameworks, you could easily end up using e.g. 4x the size of the encoded file in the Java heap.

When handling 10 kB files at 10 requests/sec, this is usually NOT an issue. But 10 MB files at 100 req/s, well, that is another ballpark.

So you'd have to check (I doubt your typical server will reach 100 req/s with 10 MB files, because that is 1 GB/s of incoming network bandwidth).

What is optimizable in your current process

In your current process, you have multiple encodings taking place: the client needs to Base64 encode the bytes read from the file (the JSON serialization does this for you when it writes the byte[] field).

When the request hits the server, the server decodes the base64 to a byte[], then your XML serialization (JAXB) reconverts the byte[] to base64.

So in effect, "you" (more exactly, the REST controller side of things) decoded base64 content, all for nothing because the XML side of things could have used it directly.
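The redundant round trip can be illustrated with a small sketch. It does not use Jackson or JAXB: java.util.Base64 stands in for what those libraries do with a byte[] field by default, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class RedundantRoundTrip {

    // Client: stand-in for the JSON mapper writing a byte[] field as Base64 text.
    static String clientEncode(byte[] fileBytes) {
        return Base64.getEncoder().encodeToString(fileBytes);
    }

    // Server, step 1: stand-in for JSON deserialization back into a byte[].
    static byte[] serverDecode(String jsonValue) {
        return Base64.getDecoder().decode(jsonValue);
    }

    // Server, step 2: stand-in for XML marshalling of a byte[] (base64 again).
    static String xmlEncode(byte[] content) {
        return Base64.getEncoder().encodeToString(content);
    }

    public static void main(String[] args) {
        byte[] file = "hello, binary world".getBytes(StandardCharsets.UTF_8);
        String onTheWire = clientEncode(file);
        String inTheXml = xmlEncode(serverDecode(onTheWire));
        // The decode/encode pair on the server reproduces the text it received.
        System.out.println("same text: " + onTheWire.equals(inTheXml));
    }
}
```

In this model the text that ends up in the XML is character for character the text that arrived in the JSON, which is why the intermediate decode is wasted work.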

What could be done

A few things.

Do you need base64 at the calling site?

First, you do not have to encode at the client side. When using JSON, there is no choice, but the world did not wait for JSON to exchange files (e.g. arbitrary binary content) over HTTP.

If your content is a file name, a MIME type, and a file body, then a standard, direct HTTP call with no JSON at all is perfectly fine.

The MIME type could be mapped to the Content-Type HTTP header, the file name to the Content-Disposition HTTP header, and the contents to the raw HTTP body. No base64 needed (but you need your server side to accept raw HTTP content as is). This is as standard as can be.
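As a rough sketch of that mapping, the following builds such a request with the JDK's java.net.http API (Java 11+) rather than RestTemplate, purely to stay dependency-free. The endpoint URL, file name, and class name are assumptions for illustration, and the request is only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class RawUploadRequest {

    // Maps the three fields of BinaryObject onto plain HTTP:
    // mimeCode -> Content-Type, fileName -> Content-Disposition, content -> body.
    static HttpRequest build(URI endpoint, String fileName, String mimeCode, byte[] content) {
        return HttpRequest.newBuilder(endpoint)
                .header("Content-Type", mimeCode)
                // Simplistic quoting: a real client must escape quotes and
                // non-ASCII characters in the file name.
                .header("Content-Disposition", "attachment; filename=\"" + fileName + "\"")
                .POST(HttpRequest.BodyPublishers.ofByteArray(content)) // raw bytes, no base64
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build(
                URI.create("http://localhost:8080/files"), // hypothetical endpoint
                "report.pdf", "application/pdf", new byte[] {1, 2, 3});
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().map());
    }
}
```

For large files, HttpRequest.BodyPublishers.ofFile(path) would avoid loading the whole file into a byte[] first.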

This change would allow you to remove the encoding (client side), lower the network size of the call (~33% less), and remove one decoding at the server side. The server would just have to base64 encode (once) a raw stream to produce the XML, and you would not even need to buffer the whole file contents for that (you'd have to tweak your JAXB model a bit, but JAXB can serialize bytes directly from an InputStream, which means almost no buffer, and since your CPU probably encodes faster than your network serves content, no real latency incurred).
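The "encode once, without buffering the whole file" idea can be sketched with the JDK alone. This is not the JAXB wiring itself, only the streaming Base64 step it would rely on; the class name and the in-memory streams standing in for the request body and the XML output are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class StreamingBase64 {

    // Copies rawBody into sink as Base64 text, chunk by chunk.
    // Closing the encoder writes the final padding and also closes sink.
    static void encode(InputStream rawBody, OutputStream sink) {
        try (OutputStream encoder = Base64.getEncoder().wrap(sink)) {
            rawBody.transferTo(encoder); // Java 9+, uses a small fixed-size buffer
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] body = "raw HTTP body".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream xmlText = new ByteArrayOutputStream();
        encode(new ByteArrayInputStream(body), xmlText);
        System.out.println(new String(xmlText.toByteArray(), StandardCharsets.ISO_8859_1));
    }
}
```

In a real controller the InputStream would be the request body and the sink would be whatever writes the element content of the XML document.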

If this, for some reason, is not an option, let's say because your client has to send JSON (and therefore base64 content), the question becomes the following.

Can you avoid decoding at the server side?

Sort of. You can use a server-side bean where the content is actually a String and NOT a byte[]. This is hacky, but your REST controller will no longer deserialize base64, it will keep it "as is", which is a JSON string (which happens to be base64 encoded content, but the controller does not care).

So your server will have saved the CPU cost of one base64 decoding, but in exchange, you'll have a base64 String in the Java heap (compared to the raw byte[], +33% size on Java >= 9 with compact strings, +166% size on Java < 9).
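Those two percentages follow from simple arithmetic, sketched below. The class name and the 3,000,000-byte example size are made up, and the calculation ignores object headers: Base64 text is pure ASCII, so it takes 1 byte per char with compact strings and 2 bytes per char (UTF-16) without them.

```java
public class Base64HeapCost {

    // Number of chars in the padded Base64 text for rawBytes input bytes.
    static long base64Chars(long rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    // Extra heap, in percent, of holding the Base64 String instead of the raw byte[].
    static double overheadPercent(long rawBytes, int bytesPerChar) {
        return 100.0 * (base64Chars(rawBytes) * bytesPerChar - rawBytes) / rawBytes;
    }

    public static void main(String[] args) {
        long raw = 3_000_000; // hypothetical file size in bytes
        System.out.printf("compact strings (1 byte/char): +%.0f%%%n", overheadPercent(raw, 1));
        System.out.printf("UTF-16 strings (2 bytes/char): +%.0f%%%n", overheadPercent(raw, 2));
    }
}
```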

If you are to profit from this, you also have to tweak your JAXB to see the base64 encoded String as a byte[], which is not trivial as far as I can tell, unless you modify the JAXB object in such a way that it accepts a String instead of the byte[], which is kind of hacky (if your JAXB objects are generated from an XML schema, this might really become a pain to implement).

All in all this is much harder - probably too much if you are not really hitting the wall, performance-wise, on this particular issue.

A few other things

Are your files pure binary, or are they actually text? If they are text, you may benefit from using CDATA encoding on the XML side instead of base64.

Is your XML actually a SOAP call? If so, and if the service supports MTOM, you could avoid base64 completely, but that is an altogether different subject.
