
Java Base64 MIME decoding/encoding throws away delimiters

I have a Base64-encoded string that looks like "data:image/png;base64,iVBORw0K". I'm trying to decode it back to binary and later encode it again into Base64 using java.util.Base64. Strangely, after decoding and encoding again, I lose the delimiters and get back "dataimage/pngbase64iVBORw0I=".

This is how I do the decoding and encoding (written in Scala, but you get the idea):

import java.util.Base64

val b64mime = "data:image/png;base64,iVBORw0K"

val decoder = Base64.getMimeDecoder
val encoder = Base64.getMimeEncoder

println(encoder.encodeToString(decoder.decode(b64mime)))

Here is an example: https://scalafiddle.io/sf/TJY7eeg/0

This also happens with javax.xml.bind.DatatypeConverter. What am I doing wrong? Is this the expected behavior?

The string you are trying to deal with looks like an example of a "data:" URL as specified in RFC 2397.

The correct way to deal with one of these is to parse it into its components, and then decode only the component that is base64 encoded. Here is the syntax:

   dataurl    := "data:" [ mediatype ] [ ";base64" ] "," data
   mediatype  := [ type "/" subtype ] *( ";" parameter )
   data       := *urlchar
   parameter  := attribute "=" value

So this says that everything up to and including the comma in your example is non-base64 data. You cannot simply treat the whole string as base64, because it contains characters (':', ';' and ',') that are not valid in any standard variant of the base64 encoding scheme.
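As a rough illustration of that grammar, here is a minimal sketch that splits a "data:" URL at the first comma and decodes only the data component. The `DataUrl` case class and its `parse` method are hypothetical names introduced for this example, not part of any library, and the sketch assumes a well-formed input; percent-decoding of non-base64 data is left out.

```scala
import java.nio.charset.StandardCharsets
import java.util.Base64

// Hypothetical container for the pieces named in the RFC 2397 grammar.
// mediaType holds the whole "mediatype" production, parameters included.
final case class DataUrl(mediaType: String, isBase64: Boolean, data: String) {
  // Decode only the data component; the header never reaches the Base64 decoder.
  def bytes: Array[Byte] =
    if (isBase64) Base64.getDecoder.decode(data)
    else data.getBytes(StandardCharsets.US_ASCII) // assumption: no percent-decoding in this sketch
}

object DataUrl {
  private val Scheme = "data:"
  private val Base64Flag = ";base64"

  // Returns None when the input is not shaped like a "data:" URL.
  def parse(url: String): Option[DataUrl] = {
    val comma = url.indexOf(',')
    if (!url.startsWith(Scheme) || comma < 0) None
    else {
      val header = url.substring(Scheme.length, comma)
      val isBase64 = header.endsWith(Base64Flag)
      val mediaType = if (isBase64) header.dropRight(Base64Flag.length) else header
      Some(DataUrl(mediaType, isBase64, url.substring(comma + 1)))
    }
  }
}
```

With the string from the question, `DataUrl.parse("data:image/png;base64,iVBORw0K").map(_.bytes.toList)` should give `Some(List(-119, 80, 78, 71, 13, 10))`, the first six bytes of the PNG signature.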

This Q&A talks about RFC 2397 parsers in Java:

Base64 doesn't have those characters in its alphabet. It looks like the MIME decoder is ignoring those invalid characters, which matches its documented behavior of skipping anything outside the Base64 alphabet.

@ decoder.decode(";") 
res10: Array[Byte] = Array()
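To make the difference between the decoders concrete, the following sketch compares the lenient MIME decoder with the basic decoder on the string from the question. The `DecoderComparison` object and its `inAlphabet` helper are names made up for this illustration; only the `java.util.Base64` calls come from the JDK.

```scala
import java.util.Base64
import scala.util.Try

object DecoderComparison {
  val input = "data:image/png;base64,iVBORw0K"

  // True for characters of the standard Base64 alphabet.
  private def inAlphabet(c: Char): Boolean =
    (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
      (c >= '0' && c <= '9') || c == '+' || c == '/'

  // What is left once ':', ';' and ',' are skipped.
  // Note that '/' survives because it is a Base64 character.
  val filtered: String = input.filter(inAlphabet)

  // The MIME decoder silently accepts the whole string...
  val lenient: Array[Byte] = Base64.getMimeDecoder.decode(input)

  // ...while the basic decoder rejects it with an IllegalArgumentException.
  val strict: Try[Array[Byte]] = Try(Base64.getDecoder.decode(input))
}
```

Here `filtered` is "dataimage/pngbase64iVBORw0K", which is 27 characters long. The trailing three-character group "w0K" carries 18 bits but only 16 fit into whole bytes, so the last two bits are dropped on decoding; that is why re-encoding ends in "w0I=" instead of "w0K". Using `Base64.getDecoder` instead of the MIME decoder would have surfaced the problem as an exception rather than as silently altered output.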

However, if you decode only the last part, you get what you want.

@ decoder.decode("iVBORw0K") 
res9: Array[Byte] = Array(-119, 80, 78, 71, 13, 10)
@ encoder.encodeToString(res9) 
res12: String = "iVBORw0K"
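Putting that together, a round trip that keeps the delimiters could look like the sketch below. `DataUrlRoundTrip.roundTrip` is a hypothetical helper written for this example, and it assumes the input carries a base64 payload after the first comma.

```scala
import java.util.Base64

object DataUrlRoundTrip {
  // Decode and re-encode only the text after the first comma;
  // the header ("data:image/png;base64,") is carried over verbatim.
  def roundTrip(dataUrl: String): String = {
    val comma = dataUrl.indexOf(',')
    require(comma >= 0, "expected a ',' separating header and data")
    val (header, payload) = dataUrl.splitAt(comma + 1)
    val bytes = Base64.getDecoder.decode(payload)
    header + Base64.getEncoder.encodeToString(bytes)
  }
}
```

The basic encoder is used here rather than the MIME encoder because the MIME encoder inserts a line separator after every 76 output characters, which is usually not wanted inside a URL. For the string in the question, `DataUrlRoundTrip.roundTrip("data:image/png;base64,iVBORw0K")` returns the input unchanged.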
