
Issue with Base64 encoding/decoding: decoded string is '?'

I am trying to read an image, Base64-encode it into a byte array, and then convert that to a String to send it over the network. The problem is that when I decode the Base64-encoded string, I get incorrect data.

For example, I am facing an issue with the special character below.

I am using the following code for encoding:

byte[] b = Base64.encodeBase64(IOUtils.toByteArray(loInputStream));
String ab = new String(b);

IOUtils is org.apache.commons.io.IOUtils.

and loInput

Code for decoding:

byte[] c = Base64.decodeBase64(ab.getBytes());
String ca = new String(c);
System.out.println(ca);

It prints ? for the decoded String.

Can anyone please tell me what the issue is?

If your input is an image, it makes sense to encode it as Base64: Base64 is text, and can be represented by a String.

Decoding it again, though, gives you the original image. An image is usually a binary format; it does not make sense to try to convert that to a String, because it is not text.

That is, the last 2 lines:

   String ca = new String(c);
   System.out.println(ca);

simply do not make sense.
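To illustrate why pushing binary data through a String is lossy, here is a minimal sketch. The class name LossyStringDemo and the sample bytes are made up for illustration, and UTF-8 is specified explicitly so the result does not depend on the platform default charset.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LossyStringDemo {

    // Pushes bytes through a String and back, using UTF-8 in both directions.
    public static byte[] throughString(byte[] data) {
        String asText = new String(data, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical binary sample: 0x89 and 0xFF are not valid UTF-8 on their own,
        // so the String constructor substitutes a replacement character for each.
        byte[] binary = {(byte) 0x89, 'P', 'N', 'G', (byte) 0xFF};
        byte[] back = throughString(binary);
        System.out.println("Survived String round trip: " + Arrays.equals(binary, back));
    }
}
```

Plain ASCII text survives the same round trip unchanged, which is why Base64 output can safely live in a String while the decoded image cannot.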

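If you want to check that the decoding produces the same output as the original input, do e.g. the following, where b must hold the original image bytes (the result of IOUtils.toByteArray(loInputStream)), not the Base64 output that b holds in the question's code: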

  System.out.println("Original and decoded are the same: " + Arrays.equals(b,c));

(Or save the byte array to a file and view the image in an image viewer.)
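A minimal sketch of such a round-trip check follows. It is written against the JDK's own java.util.Base64 (Java 8+) instead of the commons-codec class from the question, purely so that it runs without extra dependencies; the class name RoundTripCheck and the sample bytes are illustrative assumptions. The comparison is between the original bytes and the decoded bytes, not the Base64 output.

```java
import java.util.Arrays;
import java.util.Base64;

public class RoundTripCheck {

    // Binary data -> Base64 text.
    public static String encode(byte[] original) {
        return Base64.getEncoder().encodeToString(original);
    }

    // Base64 text -> binary data.
    public static byte[] decode(String base64Text) {
        return Base64.getDecoder().decode(base64Text);
    }

    public static void main(String[] args) {
        // Stand-in for image data; real code would read these bytes from the image stream.
        byte[] original = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
        String text = encode(original);
        byte[] decoded = decode(text);
        System.out.println("Base64 text: " + text);
        System.out.println("Original and decoded are the same: " + Arrays.equals(original, decoded));
    }
}
```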

As I've said elsewhere, in Java, String is for text, and byte[] is for binary data.

String ≠ byte[]

Text ≠ Binary Data

An image is binary data. Base64 is an encoding which allows transmission of binary data over US_ASCII compatible text channels (there is a similar encoding for supersets of ASCII text: Quoted Printable).

So, it goes like:

Image (binary data) → Image (text, Base64 encoded binary data) → Image (binary data)

where you would use String encodeBase64String(byte[]) to encode, and byte[] decode(String) to decode. These are the only sane APIs for Base64; byte[] encodeBase64(byte[]) is misleading, because the result is US_ASCII-compatible text (so, a String, not byte[]).

Now, text has a charset and an encoding. String uses a fixed Unicode/UTF-16 charset/encoding combination internally, and you have to specify a charset/encoding when converting something from/to a String, either explicitly or implicitly, using the platform's default encoding (which is what PrintStream.println() does). Base64 text is pure US_ASCII, so you need to use that, or a superset of US_ASCII. org.apache.commons.codec.binary.Base64 uses UTF8, which is a superset of US_ASCII, so all is well. (OTOH, the internal java.util.prefs.Base64 uses the platform's default encoding, so I guess it would break if you start your JVM with, say, a UTF-16 encoding.)
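The claim that Base64 text is safe in US_ASCII or any superset of it, but not in UTF-16, can be checked with a small sketch. It uses the JDK's java.util.Base64 to produce the text; the class and method names are hypothetical and only there for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class Base64CharsetDemo {

    // True if the text encodes to the same bytes under US-ASCII and UTF-8.
    public static boolean sameInAsciiAndUtf8(String text) {
        return Arrays.equals(
                text.getBytes(StandardCharsets.US_ASCII),
                text.getBytes(StandardCharsets.UTF_8));
    }

    // True if the text encodes to the same bytes under US-ASCII and UTF-16.
    public static boolean sameInAsciiAndUtf16(String text) {
        return Arrays.equals(
                text.getBytes(StandardCharsets.US_ASCII),
                text.getBytes(StandardCharsets.UTF_16));
    }

    public static void main(String[] args) {
        // Sample bytes are a stand-in for image data.
        String base64Text = Base64.getEncoder()
                .encodeToString(new byte[] {(byte) 0x89, 'P', 'N', 'G'});
        System.out.println("ASCII vs UTF-8 bytes identical:  " + sameInAsciiAndUtf8(base64Text));
        System.out.println("ASCII vs UTF-16 bytes identical: " + sameInAsciiAndUtf16(base64Text));
    }
}
```

Passing an explicit charset such as StandardCharsets.US_ASCII to new String(...) and getBytes(...) removes the dependency on the platform default.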

Back on topic: you've tried to print the decoded image (binary data) as text, which obviously hasn't worked. PrintStream has write() methods that can write binary data, so you could use those, and you would get the same garbage as if you wrote the original image. It would be much better to use a FileOutputStream, and compare the resulting file with the original image file.
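A minimal sketch of the FileOutputStream approach, under stated assumptions: it uses the JDK's java.util.Base64 for decoding, temporary files with made-up contents stand in for the real image, and the class and method names are hypothetical.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Base64;

public class DecodeToFile {

    // Decodes Base64 text and writes the resulting bytes, untouched, to the target file.
    public static void writeDecoded(String base64Text, Path target) throws IOException {
        byte[] decoded = Base64.getDecoder().decode(base64Text);
        try (FileOutputStream out = new FileOutputStream(target.toFile())) {
            out.write(decoded);
        }
    }

    // Byte-for-byte comparison; reads both files into memory, so only suited to small files.
    public static boolean sameContent(Path a, Path b) throws IOException {
        return Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b));
    }

    public static void main(String[] args) throws IOException {
        Path original = Files.createTempFile("original", ".bin");
        Path copy = Files.createTempFile("decoded", ".bin");
        // Stand-in for an image file on disk.
        Files.write(original, new byte[] {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A});

        String text = Base64.getEncoder().encodeToString(Files.readAllBytes(original));
        writeDecoded(text, copy);
        System.out.println("Files match: " + sameContent(original, copy));

        Files.delete(original);
        Files.delete(copy);
    }
}
```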


 