
XML serialization and deserialization of objects containing invalid chars in properties

I know this has been asked many times before, but I still don't see a good solution.
There is an object like this:

public class DTO
{
    public string Value;
}

I need to serialize it in the Exporter app and then deserialize it in the Importer.
The object's Value may contain characters that are not valid in XML (e.g. 0x8). I need to either have the Exporter remove such characters or have the Importer successfully load an object that contains them. I would rather not clean up the objects before serialization, because I have dozens of them with dozens of string properties each.

  1. Exporter side. If I enable CheckCharacters here, I get an error at the serialization step, and I don't see a way to control all strings in one spot. If I disable it, the XML will contain the invalid character.

     XmlWriterSettings xmlWriterSettings = new XmlWriterSettings { CheckCharacters = false };
     XmlSerializer xmlSerializer = new XmlSerializer(typeof(DTO));
     StringBuilder sb = new StringBuilder();
     DTO dto = new DTO { Value = Convert.ToChar(0x08).ToString() };
     using (XmlWriter xmlWriter = XmlWriter.Create(sb, xmlWriterSettings))
     {
         xmlSerializer.Serialize(xmlWriter, dto);
         xmlWriter.Flush();
         xmlWriter.Close();
     }

  2. OK, if I let the invalid character into the XML, there seems to be no way to handle it on the Importer side. Even with CheckCharacters = false, the error occurs on the Deserialize() call:

     var _reader = XmlReader.Create(File.OpenText(path), new XmlReaderSettings() { CheckCharacters = false });
     _reader.MoveToContent();
     var outerXml = _reader.ReadOuterXml();
     xmlSerializer.Deserialize(new StringReader(outerXml)); // <== getting error here

Is there a way to remove invalid characters in either step so that the object can be exported and imported without errors?

That was my bad :(
The problem was here:

var outerXml = _reader.ReadOuterXml();
xmlSerializer.Deserialize(new StringReader(outerXml)); // <== getting error here

xmlSerializer was actually using an implicitly created internal XmlReader, which did check characters. All I had to do four hours ago was:

xmlSerializer.Deserialize(_reader);
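For reference, the whole round trip can be sketched as below. This is only a minimal sketch under the assumptions of the question: the DTO class is the one shown above, while `RoundTripSketch`, `Export`, and `Import` are hypothetical names introduced for illustration, and the XML is kept in a string rather than a file.

```csharp
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

public class DTO
{
    public string Value;
}

// Hypothetical helper (not from the original post) showing both sides together.
public static class RoundTripSketch
{
    // Exporter side: the writer is created with CheckCharacters = false,
    // so it does not throw on characters that are invalid in XML.
    public static string Export(DTO dto)
    {
        var settings = new XmlWriterSettings { CheckCharacters = false };
        var serializer = new XmlSerializer(typeof(DTO));
        var sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            serializer.Serialize(writer, dto);
        }
        return sb.ToString();
    }

    // Importer side: the reader created with CheckCharacters = false is passed
    // directly to Deserialize, so no internal, character-checking reader is created.
    public static DTO Import(string xml)
    {
        var settings = new XmlReaderSettings { CheckCharacters = false };
        var serializer = new XmlSerializer(typeof(DTO));
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            return (DTO)serializer.Deserialize(reader);
        }
    }
}
```

The important part is that the same XmlReader configured with CheckCharacters = false is the one the serializer reads from; going through ReadOuterXml() and a new StringReader loses that setting.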

I'm not saying this is a great solution, but the code below will remove non-UTF-8 characters when serializing:

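    using System.Linq;
    using System.Text;

    public class DTO
    {
        private string _value;

        public string Value
        {
            // Truncates each char to a single byte, then decodes the bytes as UTF-8.
            get { return _value == null ? null : Encoding.UTF8.GetString(_value.Select(x => (byte)((int)x)).ToArray()); }
            set { _value = value; }
        }
    }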
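A different technique, if the goal is to drop XML-invalid characters on the Exporter side in one spot instead of per property, is to filter them in the writer. The following is only a sketch: `XmlCharSanitizer` and `SanitizingXmlTextWriter` are hypothetical names, and it assumes that XmlSerializer passes string values through `WriteString`, which is worth verifying for your own types.

```csharp
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper: keeps only characters that are legal in XML 1.0.
public static class XmlCharSanitizer
{
    public static string Strip(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;

        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (char.IsHighSurrogate(c) && i + 1 < text.Length && char.IsLowSurrogate(text[i + 1]))
            {
                // A well-formed surrogate pair is a code point above U+FFFF; keep both halves.
                sb.Append(c).Append(text[i + 1]);
                i++;
            }
            else if (!char.IsSurrogate(c) && XmlConvert.IsXmlChar(c))
            {
                sb.Append(c);
            }
            // Anything else (e.g. 0x08 or a lone surrogate) is dropped.
        }
        return sb.ToString();
    }
}

// Hypothetical writer: sanitizes every string written through WriteString.
public sealed class SanitizingXmlTextWriter : XmlTextWriter
{
    public SanitizingXmlTextWriter(TextWriter output) : base(output) { }

    public override void WriteString(string text)
    {
        base.WriteString(XmlCharSanitizer.Strip(text));
    }
}
```

Usage would be to pass a `SanitizingXmlTextWriter` to `xmlSerializer.Serialize(...)` in place of the writer created by `XmlWriter.Create`. The trade-off is that the invalid characters are lost on export, whereas the CheckCharacters = false approach preserves them.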
