Correct XML serialization and deserialization of "mixed" types in .NET

Question

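My current task involves writing a class library for processing HL7 CDA files.
These HL7 CDA files are XML files with a defined XML schema, so I used xsd.exe to generate .NET classes for XML serialization and deserialization.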

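The XML schema contains various types that carry the mixed="true" attribute, which specifies that an XML node of this type may contain plain text mixed with other XML nodes.
The relevant part of the XML schema for one of these types looks like this: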

<xs:complexType name="StrucDoc.Paragraph" mixed="true">
    <xs:sequence>
        <xs:element name="caption" type="StrucDoc.Caption" minOccurs="0"/>
        <xs:choice minOccurs="0" maxOccurs="unbounded">
            <xs:element name="br" type="StrucDoc.Br"/>
            <xs:element name="sub" type="StrucDoc.Sub"/>
            <xs:element name="sup" type="StrucDoc.Sup"/>
            <!-- ...other possible nodes... -->
        </xs:choice>
    </xs:sequence>
    <xs:attribute name="ID" type="xs:ID"/>
    <!-- ...other attributes... -->
</xs:complexType>

The generated code for this type looks like this:

/// <remarks/>
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.3038")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(TypeName="StrucDoc.Paragraph", Namespace="urn:hl7-org:v3")]
public partial class StrucDocParagraph {

    private StrucDocCaption captionField;

    private object[] itemsField;

    private string[] textField;

    private string idField;

    // ...fields for other attributes...

    /// <remarks/>
    public StrucDocCaption caption {
        get {
            return this.captionField;
        }
        set {
            this.captionField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("br", typeof(StrucDocBr))]
    [System.Xml.Serialization.XmlElementAttribute("sub", typeof(StrucDocSub))]
    [System.Xml.Serialization.XmlElementAttribute("sup", typeof(StrucDocSup))]
    // ...other possible nodes...
    public object[] Items {
        get {
            return this.itemsField;
        }
        set {
            this.itemsField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlTextAttribute()]
    public string[] Text {
        get {
            return this.textField;
        }
        set {
            this.textField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="ID")]
    public string ID {
        get {
            return this.idField;
        }
        set {
            this.idField = value;
        }
    }

    // ...properties for other attributes...
}

If I deserialize an XML element where the paragraph node looks like this:

<paragraph>first line<br /><br />third line</paragraph>

the result is that the items and text arrays are read like this:

itemsField = new object[]
{
    new StrucDocBr(),
    new StrucDocBr(),
};
textField = new string[]
{
    "first line",
    "third line",
};

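From this there is no way to determine the exact order of the text and the other nodes.
If I serialize this again, the result looks exactly like this: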

<paragraph>
    <br />
    <br />first linethird line
</paragraph>

The default serializer simply serializes the items first and then the text.

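I started implementing IXmlSerializable on the StrucDocParagraph class so that I could control the deserialization and serialization of the content, but it is rather complex because so many classes are involved. I have not arrived at a solution yet, because I do not know whether the effort would pay off.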

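Is there some kind of easy workaround for this problem, or is it even possible to handle it with custom serialization via IXmlSerializable? Or should I just use XmlDocument or XmlReader/XmlWriter to process these documents?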

Answer 1

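To solve this problem I had to modify the generated classes:

Move the XmlTextAttribute from the Text property to the Items property and add the parameter Type = typeof(string)
Remove the Text property
Remove the textField field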

As a result, the generated code (modified) looks like this:

/// <remarks/>
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.3038")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(TypeName="StrucDoc.Paragraph", Namespace="urn:hl7-org:v3")]
public partial class StrucDocParagraph {

    private StrucDocCaption captionField;

    private object[] itemsField;

    private string idField;

    // ...fields for other attributes...

    /// <remarks/>
    public StrucDocCaption caption {
        get {
            return this.captionField;
        }
        set {
            this.captionField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("br", typeof(StrucDocBr))]
    [System.Xml.Serialization.XmlElementAttribute("sub", typeof(StrucDocSub))]
    [System.Xml.Serialization.XmlElementAttribute("sup", typeof(StrucDocSup))]
    // ...other possible nodes...
    [System.Xml.Serialization.XmlTextAttribute(typeof(string))]
    public object[] Items {
        get {
            return this.itemsField;
        }
        set {
            this.itemsField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="ID")]
    public string ID {
        get {
            return this.idField;
        }
        set {
            this.idField = value;
        }
    }

    // ...properties for other attributes...
}

Now if I deserialize an XML element where the paragraph node looks like this:

<paragraph>first line<br /><br />third line</paragraph>

the result is that the items array is read like this:

itemsField = new object[]
{
    "first line",
    new StrucDocBr(),
    new StrucDocBr(),
    "third line",
};

This is exactly what I need: the order of the items and their content is correct.
And if I serialize this again, the result is again correct:

<paragraph>first line<br /><br />third line</paragraph>

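What pointed me in the right direction was Guillaume's answer; I also thought that something like this must be possible. And then there was this passage in the MSDN documentation for XmlTextAttribute:

You can apply the XmlTextAttribute to a field or property that returns an array of strings. You can also apply the attribute to an array of type Object but you must set the Type property to string. In such a case, any strings inserted into the array are serialized as XML text.

So serialization and deserialization work correctly now, but I do not know whether there are any other side effects. Maybe it is no longer possible to generate a schema from these classes with xsd.exe, but I do not need that anyway.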
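The attribute change described above can be tried in isolation. The following is a minimal sketch, not the generated HL7 code: `Br` and `Paragraph` are hypothetical stand-ins for StrucDocBr and StrucDocParagraph, the namespace and other attributes are left out, and `RoundTrip` is a helper name chosen for illustration.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-in for the generated StrucDocBr class.
public class Br { }

// Hypothetical stand-in for the modified StrucDocParagraph class.
[XmlRoot("paragraph")]
public class Paragraph
{
    // Elements and text nodes share one array, so document order is kept.
    [XmlElement("br", typeof(Br))]
    [XmlText(typeof(string))]
    public object[] Items { get; set; }
}

public static class Program
{
    // Deserializes the given XML and serializes the result again.
    public static string RoundTrip(string xml)
    {
        var serializer = new XmlSerializer(typeof(Paragraph));

        Paragraph paragraph;
        using (var reader = new StringReader(xml))
        {
            paragraph = (Paragraph)serializer.Deserialize(reader);
        }

        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        var namespaces = new XmlSerializerNamespaces();
        namespaces.Add("", ""); // suppress the default xsi/xsd declarations

        using (var output = new StringWriter())
        using (var writer = XmlWriter.Create(output, settings))
        {
            serializer.Serialize(writer, paragraph, namespaces);
            writer.Flush();
            return output.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(
            RoundTrip("<paragraph>first line<br /><br />third line</paragraph>"));
    }
}
```

With this shape, the text nodes and the br elements should come back out in the same relative order in which they were read.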

Answer 2

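I had the same problem and came across this solution of altering the .cs file generated by xsd.exe. Although it did work, I was not comfortable with altering generated code, because I would need to remember to do it every time I regenerated the classes. It also led to some awkward code that had to test for and cast to XmlNode[] for the mailto elements.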
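For readers who share the concern about editing generated files, one possible alternative (not taken from either answer) is to apply the same attribute change at run time with XmlAttributeOverrides. The sketch below is an assumption-laden illustration: `Br` and `Paragraph` are hypothetical stand-ins shaped like unmodified xsd.exe output, and `OverrideDemo` and `ReadItems` are names invented here.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for a generated element class.
public class Br { }

// Hypothetical stand-in shaped like unmodified xsd.exe output:
// elements and text live in two separate arrays.
[XmlRoot("paragraph")]
public class Paragraph
{
    [XmlElement("br", typeof(Br))]
    public object[] Items { get; set; }

    [XmlText]
    public string[] Text { get; set; }
}

public static class OverrideDemo
{
    // Built once and reused, since a serializer created with overrides
    // is not cached by the framework.
    public static readonly XmlSerializer Serializer = CreateSerializer();

    private static XmlSerializer CreateSerializer()
    {
        // Overrides replace all serialization attributes on a member,
        // so the element mapping has to be repeated here.
        var items = new XmlAttributes();
        items.XmlElements.Add(new XmlElementAttribute("br", typeof(Br)));
        items.XmlText = new XmlTextAttribute(typeof(string));

        // The separate Text array is ignored entirely.
        var text = new XmlAttributes { XmlIgnore = true };

        var overrides = new XmlAttributeOverrides();
        overrides.Add(typeof(Paragraph), "Items", items);
        overrides.Add(typeof(Paragraph), "Text", text);
        return new XmlSerializer(typeof(Paragraph), overrides);
    }

    // Deserializes a paragraph and returns its mixed content in document order.
    public static object[] ReadItems(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return ((Paragraph)Serializer.Deserialize(reader)).Items;
        }
    }
}
```

The trade-off is that every mixed type in the schema needs its own pair of overrides, which may be a lot of setup for a schema as large as HL7 CDA.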

My solution was to rethink the xsd. I dropped the mixed type and essentially defined my own mixed type.

I had this

XML: <text>some text <mailto>me@email.com</mailto>some more text</text>

<xs:complexType name="text" mixed="true">
    <xs:sequence>
      <xs:element minOccurs="0" maxOccurs="unbounded" name="mailto" type="xs:string" />
    </xs:sequence>
  </xs:complexType>

and changed it to

XML: <mytext><text>some text </text><mailto>me@email.com</mailto><text>some more text</text></mytext>

<xs:complexType name="mytext">
    <xs:sequence>
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element name="text">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string" />
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
        <xs:element name="mailto">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string" />
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
      </xs:choice>
    </xs:sequence>
  </xs:complexType>

My generated code now gives me a class myText:

public partial class myText{

    private object[] itemsField;

    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("mailto", typeof(myTextTextMailto))]
    [System.Xml.Serialization.XmlElementAttribute("text", typeof(myTextText))]
    public object[] Items {
        get {
            return this.itemsField;
        }
        set {
            this.itemsField = value;
        }
    }
}

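The order of the elements is now preserved during serialization/deserialization, but I do have to test for, cast to, and program against the types myTextTextMailto and myTextText.

Just thought I would throw that in as an alternative approach that worked for me.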
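The testing and casting mentioned above could look roughly like the following sketch. `TextPart` and `MailtoPart` are hypothetical stand-ins for the generated myTextText and myTextTextMailto classes, and the `Value` property is an assumption about how the simpleContent text is exposed. The pattern-matching `switch` needs C# 7 or later.

```csharp
using System;
using System.Text;

// Hypothetical stand-ins for the generated element classes;
// Value is assumed to hold the element's text content.
public class TextPart { public string Value { get; set; } }
public class MailtoPart { public string Value { get; set; } }

public static class MixedContent
{
    // Walks the Items array in document order and flattens it to plain text,
    // wrapping mail addresses in angle brackets.
    public static string Render(object[] items)
    {
        var result = new StringBuilder();
        foreach (var item in items ?? new object[0])
        {
            switch (item)
            {
                case TextPart text:
                    result.Append(text.Value);
                    break;
                case MailtoPart mailto:
                    result.Append('<').Append(mailto.Value).Append('>');
                    break;
                default:
                    throw new NotSupportedException(
                        "Unexpected item type: " + (item?.GetType().Name ?? "null"));
            }
        }
        return result.ToString();
    }
}
```

Throwing on unknown types is a deliberate choice in this sketch, so that a schema change surfaces as an error instead of silently dropping content.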

Answer 3

What about

itemsField = new object[] 
{ 
    "first line", 
    new StrucDocBr(), 
    new StrucDocBr(), 
    "third line", 
};

?
