
Converting Base64 to string inserts whitespace

I'm trying to convert a Base64-encoded string to text. I'm using the following code:

public static string Base64Decode(string base64EncodedData)
{
    var base64EncodedBytes = System.Convert.FromBase64String(base64EncodedData);
    return System.Text.Encoding.UTF8.GetString(base64EncodedBytes);
}

It mostly works, but it puts whitespace after each character. Furthermore, it adds an invalid character at the beginning of the converted string. The content of the Base64 string is XML, so when it is converted to text with the extra whitespace, the XML becomes invalid. Is there an alternative?

Here's a sample output after conversion:

? < ? x m l  v e r s i o n = " 1 . 0 "  e n c o d i n g = " U T F - 1 6 "  s t a n d a l o n e = " n o " ? >   < I m p o r t >     < o p t i o n s >       < P r o c N a m e > E R P N u m b e r < / P r o c N a m e >       < J o b I D > A N L 0 0 1 8 5 0 < / J o b I D >     < / o p t i o n s >     < R o w >       < D o c I d  / >       < E R P N u m b e r  / >     < / R o w >   < / I m p o r t > 

It looks like the original binary data is a string converted to bytes using UTF-16, which matches the encoding="UTF-16" part of the text. You need to use the right encoding when converting the binary data back to a string:

return Encoding.Unicode.GetString(base64EncodedBytes);

That's assuming you can't change what's producing the data in the first place. If you can change that to use UTF-8 instead, you'll end up with half as much data if the text is all ASCII characters...
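To make the mismatch concrete, here is a minimal sketch that simulates a producer writing UTF-16 with a byte order mark, then decodes the result both ways. The sample XML, the class name Utf16Demo, and the helper DecodeUtf16 are assumptions for illustration and are not from the question. It also suggests that the "whitespace" is most likely U+0000 characters: every ASCII character in UTF-16 LE is followed by a zero byte, and UTF-8 decodes that byte as a NUL character.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Utf16Demo
{
    // Hypothetical helper: decode Base64 as UTF-16 LE and drop a leading byte order mark, if any.
    public static string DecodeUtf16(string base64)
    {
        byte[] bytes = Convert.FromBase64String(base64);
        // Encoding.GetString does not strip the BOM, so U+FEFF is trimmed explicitly.
        return Encoding.Unicode.GetString(bytes).TrimStart('\uFEFF');
    }

    public static void Main()
    {
        const string xml = "<Import />"; // assumed sample payload

        // Simulate the producer: BOM (FF FE) followed by UTF-16 LE bytes.
        byte[] utf16 = Encoding.Unicode.GetPreamble()
            .Concat(Encoding.Unicode.GetBytes(xml))
            .ToArray();
        string base64 = Convert.ToBase64String(utf16);

        // Wrong encoding: the zero bytes survive as U+0000 characters in the string.
        string wrong = Encoding.UTF8.GetString(Convert.FromBase64String(base64));
        Console.WriteLine(wrong.IndexOf('\0') >= 0);

        // Right encoding: the original text comes back.
        Console.WriteLine(DecodeUtf16(base64) == xml);

        // For all-ASCII text, UTF-16 needs twice as many bytes as UTF-8.
        Console.WriteLine(Encoding.Unicode.GetByteCount(xml));
        Console.WriteLine(Encoding.UTF8.GetByteCount(xml));
    }
}
```

Note that Encoding.Unicode.GetString keeps the byte order mark as a leading U+FEFF character, which is why the helper trims it before returning.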

As Jon Skeet explained in his answer, the string appears to be encoded in UTF-16, not UTF-8.

You also wrote:

Furthermore, it adds an invalid character in the beginning of converted string.

This invalid character is almost certainly a byte order mark, a small prefatory sequence of bytes that indicates the specific encoding used in the stream. Given its presence, you can configure a StreamReader to detect the encoding by using the new StreamReader(Stream, true) constructor:

public static string Base64Decode(string base64EncodedData)
{
    var base64EncodedBytes = System.Convert.FromBase64String(base64EncodedData);
    using (var reader = new StreamReader(new MemoryStream(base64EncodedBytes), true))
    {
        return reader.ReadToEnd();
    }
}

Note that the StreamReader will consume the byte order mark during processing, so it is not included in the returned string.
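One caveat worth sketching: detection only helps when a byte order mark is actually present. Without one, new StreamReader(Stream, true) falls back to UTF-8, so data from a producer that omits the mark would be garbled again. The variant below uses the StreamReader(Stream, Encoding, bool) overload to supply an explicit fallback. The class name BomDetectionDemo, the helper Decode, and the sample payload are assumptions for illustration.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class BomDetectionDemo
{
    // Hypothetical helper: honor a byte order mark if present, otherwise use the given fallback encoding.
    public static string Decode(string base64, Encoding fallback)
    {
        byte[] bytes = Convert.FromBase64String(base64);
        using (var reader = new StreamReader(new MemoryStream(bytes), fallback,
                                             detectEncodingFromByteOrderMarks: true))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        const string xml = "<Import />"; // assumed sample payload
        byte[] body = Encoding.Unicode.GetBytes(xml);
        byte[] withBom = Encoding.Unicode.GetPreamble().Concat(body).ToArray();

        // BOM present: UTF-16 is detected even though the fallback is UTF-8.
        Console.WriteLine(Decode(Convert.ToBase64String(withBom), Encoding.UTF8) == xml);

        // No BOM: the explicit UTF-16 fallback is what makes decoding succeed.
        Console.WriteLine(Decode(Convert.ToBase64String(body), Encoding.Unicode) == xml);
    }
}
```

Passing Encoding.Unicode as the fallback is only appropriate if you know the producer writes UTF-16; otherwise the default UTF-8 fallback is usually the safer guess.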

Alternatively, since your Base64 data is actually XML, and XML contains its own encoding declaration, you could extract the byte array and parse it directly using an XmlReader:

public static XmlReader CreateXmlReaderFromBase64(string base64EncodedData, XmlReaderSettings settings = null)
{
    var base64EncodedBytes = System.Convert.FromBase64String(base64EncodedData);
    return XmlReader.Create(new MemoryStream(base64EncodedBytes), settings);
}

According to the docs, XmlReader.Create(Stream) will detect encoding as required:

The XmlReader scans the first bytes of the stream looking for a byte order mark or other sign of encoding. When encoding is determined, the encoding is used to continue reading the stream, and processing continues parsing the input as a stream of (Unicode) characters.
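As a usage sketch, the bytes can be loaded straight into an XDocument without ever producing an intermediate string. The class name XmlFromBase64Demo and the helper Load are assumptions for illustration. The sample document is a trimmed version of the output shown in the question, re-encoded here as UTF-16 with a byte order mark to mimic the producer.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class XmlFromBase64Demo
{
    // Hypothetical helper: parse Base64-encoded XML and let XmlReader work out the encoding.
    public static XDocument Load(string base64)
    {
        byte[] bytes = Convert.FromBase64String(base64);
        using (var stream = new MemoryStream(bytes))
        using (var reader = XmlReader.Create(stream))
        {
            return XDocument.Load(reader);
        }
    }

    public static void Main()
    {
        const string xml =
            "<?xml version=\"1.0\" encoding=\"UTF-16\" standalone=\"no\"?>" +
            "<Import><options><JobID>ANL001850</JobID></options></Import>";

        // Simulate the producer: BOM followed by UTF-16 LE bytes, then Base64.
        byte[] bytes = Encoding.Unicode.GetPreamble()
            .Concat(Encoding.Unicode.GetBytes(xml))
            .ToArray();

        XDocument doc = Load(Convert.ToBase64String(bytes));
        Console.WriteLine(doc.Root.Element("options").Element("JobID").Value);
    }
}
```

Parsing from bytes sidesteps the encoding question entirely, which is convenient when the caller needs the XML content and not the raw text.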
