
C# base64 encoding/decoding with serialization of objects issue

I'm using serialization and deserialization in C# for my Project (which is a class). The objects are serialized and saved to an XML file. When loading the Project, all goes well.

Now I'm trying to encode the serialized Project to Base64 and then save the file, which goes well too. The first lines of the file (before encoding!) look like this:

<?xml version="1.0" encoding="utf-8"?>
  <Project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">

When I decode the file, there's a ? added in front of the first line:

?<?xml version="1.0" encoding="utf-8"?>
  <Project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">

The code I use to encode:

byte[] toEncodeAsBytes = System.Text.ASCIIEncoding.ASCII.GetBytes(toEncode);
        string returnValue = System.Convert.ToBase64String(toEncodeAsBytes);
        return returnValue;

And the code for decoding:

byte[] encodedDataAsBytes = System.Convert.FromBase64String(encodedData);
        string returnValue = System.Text.ASCIIEncoding.ASCII.GetString(encodedDataAsBytes);
        return returnValue;

What can this be and how can I fix it?

The file declares itself as UTF-8, so why are you using ASCII to encode it into binary? There are many characters in UTF-8 which can't be represented in ASCII. Do you even need to have the file in text form in memory to start with? Why not just load it as binary data (e.g. File.ReadAllBytes)?
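As a rough sketch of the binary-only route suggested above: the class and method names here (Base64FileSketch, EncodeFile, DecodeToFile) are made up for illustration and are not from the question. Since the bytes are never turned into a string, no text encoding is involved and a byte order mark passes through untouched.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original question's code.
public static class Base64FileSketch
{
    // Reads the file as raw bytes and Base64-encodes them; no text decoding happens.
    public static string EncodeFile(string path)
    {
        byte[] raw = File.ReadAllBytes(path);
        return Convert.ToBase64String(raw);
    }

    // Decodes the Base64 text and writes the original bytes back unchanged.
    public static void DecodeToFile(string base64, string path)
    {
        byte[] raw = Convert.FromBase64String(base64);
        File.WriteAllBytes(path, raw);
    }
}
```

This only fits if the serialized XML is already on disk (or available as a byte array); if it exists only as a string, an explicit encoding is still needed.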

If you do need to start with a string, use Encoding.UTF8 (or Encoding.Unicode, although that will probably result in a bigger byte array) for both encoding and decoding, and everything should be fine. That extra character is a byte order mark, which can't be represented in ASCII, hence the "?" replacement character.
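To see where the "?" comes from, here is a small sketch that encodes a lone byte order mark (U+FEFF) both ways. The BomDemo class is hypothetical; it assumes the default Encoding.ASCII behaviour of substituting "?" for anything outside the ASCII range.

```csharp
using System.Text;

// Hypothetical demo class, not part of the original question's code.
public static class BomDemo
{
    // ASCII cannot represent U+FEFF, so the encoder substitutes '?' (byte 0x3F).
    public static byte[] BomAsAscii() => Encoding.ASCII.GetBytes("\uFEFF");

    // UTF-8 represents U+FEFF as the three bytes EF BB BF.
    public static byte[] BomAsUtf8() => Encoding.UTF8.GetBytes("\uFEFF");
}
```

So the damage happens on the encode side: by the time the data is Base64 text, the byte order mark has already been replaced by a literal question mark.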

At a guess, the ? represents the byte order mark, which is a character that cannot be represented in ASCII. Why are you not using the UTF-8 encoding?

byte[] toEncodeAsBytes = System.Text.Encoding.UTF8.GetBytes(toEncode);
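The decoding side needs the matching change. A minimal sketch of both methods together, assuming the same parameter names as in the question (the Base64Text class name is invented here):

```csharp
using System;
using System.Text;

// Hypothetical wrapper around the question's two snippets, switched to UTF-8.
public static class Base64Text
{
    public static string Encode(string toEncode)
    {
        byte[] toEncodeAsBytes = Encoding.UTF8.GetBytes(toEncode);
        return Convert.ToBase64String(toEncodeAsBytes);
    }

    public static string Decode(string encodedData)
    {
        byte[] encodedDataAsBytes = Convert.FromBase64String(encodedData);
        // Must be the same encoding that Encode used.
        return Encoding.UTF8.GetString(encodedDataAsBytes);
    }
}
```

Note that this round-trips the string exactly, so if it started with a byte order mark character it still does after decoding; the character is just invisible rather than a "?". If a consumer objects to it, something like `decoded.TrimStart('\uFEFF')` should remove it.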

Rather than having to worry about encoding, perhaps just use XmlWriter.Create(outPath), and pass that XmlWriter to your serialization code. That will avoid this issue, and other issues (such as having to buffer very large strings for large object graphs). There is an overload that accepts an XmlWriterSettings for finer control.

XmlWriter is accepted by most XML code.
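A sketch of what that could look like, assuming the serialization is done with XmlSerializer (the question doesn't say, but the xsi/xsd namespaces in the output suggest it). The Project class below is a hypothetical stand-in with a single made-up property, and ProjectStorage is an invented name.

```csharp
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-in for the asker's Project class.
public class Project
{
    public string Name { get; set; }
}

// Hypothetical helper, not part of the original question's code.
public static class ProjectStorage
{
    public static void Save(Project project, string outPath)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            // UTF-8 without a byte order mark; use Encoding.UTF8 instead to keep the BOM.
            Encoding = new UTF8Encoding(false)
        };
        var serializer = new XmlSerializer(typeof(Project));
        using (XmlWriter writer = XmlWriter.Create(outPath, settings))
        {
            serializer.Serialize(writer, project);
        }
    }

    public static Project Load(string inPath)
    {
        var serializer = new XmlSerializer(typeof(Project));
        using (XmlReader reader = XmlReader.Create(inPath))
        {
            return (Project)serializer.Deserialize(reader);
        }
    }
}
```

The writer takes care of the encoding and the XML declaration itself, so there is no intermediate string to convert. If the Base64 step is still wanted, the resulting file can be read back as bytes and encoded from there.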
