
How to define the encoding of a Base64-encoded string in Java

I received an XML file containing a PDF attachment encoded as a Base64 string, and I am trying to generate a PDF file from it. The following code works well:

String base64encodedPdf = " ....   ";
byte[] imgBytes = javax.xml.bind.DatatypeConverter.parseBase64Binary(base64encodedPdf);
// Close the stream once the bytes are written
try (FileOutputStream out = new FileOutputStream("C:\\test.pdf")) {
    IOUtils.write(imgBytes, out);
}

The problem arises when the attachment data is too big to paste into the editor directly. I thought I could copy it to a text file, read the file, and convert its contents to a String. This is how I do it:

org.apache.commons.io.FileUtils.readFileToString(file, encoding)

I am curious which encoding I should specify (UTF-8, UTF-16, ...) and why.

EDIT:

This is the meta-information available to me:

<AttachmentType tc="1">Document</AttachmentType>
<MimeType>application/pdf</MimeType>
<TransferEncodingTypeString>Base64</TransferEncodingTypeString>
<TransferEncodingTypeTC tc="4">Base64</TransferEncodingTypeTC>

It depends on which encoding was used when the text file was written. Java's text-related IO classes, such as PrintWriter, have constructors that let you specify the encoding explicitly, e.g.:

new PrintWriter("foo.txt", "UTF-8");

If you don't, it will use the default encoding, which may vary depending on the platform and JVM settings. You can check your platform's default encoding using:

Charset.defaultCharset()
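As a rough illustration of why the choice matters less for Base64 than for arbitrary text: the Base64 alphabet consists only of ASCII characters, so any ASCII-compatible charset (US-ASCII, UTF-8, ISO-8859-1) maps it to the same bytes, while UTF-16 does not. The sketch below prints the default charset and compares the byte representations. The class name CharsetCheck, the helper sameBytes, and the sample string are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetCheck {

    // Hypothetical helper: true if both charsets encode the text to identical bytes.
    public static boolean sameBytes(String text, Charset a, Charset b) {
        return Arrays.equals(text.getBytes(a), text.getBytes(b));
    }

    public static void main(String[] args) {
        // The default depends on the platform and JVM settings.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // Sample Base64 text (the encoding of "%PDF-1.4"); illustrative only.
        String base64 = "JVBERi0xLjQ=";

        // ASCII-compatible charsets agree on Base64 text: prints true
        System.out.println(sameBytes(base64, StandardCharsets.UTF_8, StandardCharsets.US_ASCII));

        // UTF-16 uses two bytes per character (plus a byte-order mark): prints false
        System.out.println(sameBytes(base64, StandardCharsets.UTF_8, StandardCharsets.UTF_16));
    }
}
```

So if the text file was saved as UTF-8, ASCII, or a Latin-1 style encoding, reading it with any of those should give the same Base64 string; only a file saved as UTF-16 needs to be read as UTF-16.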

Still, it is good practice to always specify the intended encoding explicitly when writing to a file.
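Putting this together for the question's scenario, here is a minimal sketch that writes Base64 text to a file with an explicit charset, reads it back with the same charset, and decodes it to bytes. It uses only the JDK (java.nio.file.Files and java.util.Base64, available since Java 8) instead of the Commons IO and JAXB calls from the question. The class name, the method decodeBase64File, the temp-file names, and the sample payload are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Base64;

public class Base64FileRoundTrip {

    // Reads Base64 text from a file using the given charset and decodes it to raw bytes.
    // The MIME decoder skips line breaks and other characters outside the Base64 alphabet.
    public static byte[] decodeBase64File(Path textFile, Charset charset) throws IOException {
        String base64 = new String(Files.readAllBytes(textFile), charset);
        return Base64.getMimeDecoder().decode(base64);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the real attachment bytes (not an actual PDF).
        byte[] original = "%PDF-1.4 hypothetical sample".getBytes(StandardCharsets.US_ASCII);
        String base64 = Base64.getEncoder().encodeToString(original);

        // Write the Base64 text with an explicit charset...
        Path textFile = Files.createTempFile("attachment", ".txt");
        Files.write(textFile, base64.getBytes(StandardCharsets.UTF_8));

        // ...and read it back with the same charset before decoding.
        byte[] decoded = decodeBase64File(textFile, StandardCharsets.UTF_8);

        Path pdf = Files.createTempFile("attachment", ".pdf");
        Files.write(pdf, decoded);

        System.out.println("Round trip intact: " + Arrays.equals(original, decoded));
    }
}
```

The key point is that the charset passed when reading matches the one the file was saved with; the decoder itself only ever sees the resulting Java String.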
