XmlSerializer filter by attribute

When deserializing XML into an entity class using XmlSerializer, is it possible to filter by an attribute? For example, let's say I have an item which can be of type "a" or type "b". I want to deserialize only the items of type "a" and ignore the rest.

I need this because our endpoint receives very large XML documents (some upwards of 100MB) containing hundreds of thousands of <item> elements, but I need only some of them: those of type "a". I want to avoid allocations for the rest, including their child elements, of which there are many.

Example XML:

<root>
  <item type="a"/>
  <item type="a"/>
  <item type="b"/>
  <item type="c"/>
</root>

Entities:

[XmlRoot("root")]
public class Root {
    [XmlElement("item")]
    public Item[] Items { get; set; }
}
public class Item  {
    [XmlAttribute("type")]
    [DeserializeIfValueIs("a")] // <-- Is there something like this?
    public string Type { get; set; }
}

Code:

var serializer = new XmlSerializer(typeof(Root));
var dto = (Root) serializer.Deserialize(XmlReader.Create("input.xml"));
// Show the results - {"Items":[{"Type":"a"},{"Type":"a"},{"Type":"b"},{"Type":"c"}]}
Console.WriteLine(JsonConvert.SerializeObject(dto));

How do I make it allocate objects only for type "a" items?

Obligatory note: This is neither an XY problem nor premature optimization. Profiling has shown that we need to improve performance here. Filtering out the values after deserialization does not help either: by then the allocations have already been made and will have to be garbage-collected.

This is possible by handling the deserialization process ourselves, at least for the root class.

Please note that the XML you provided is not enough to run unit tests against, so this is a very basic implementation. It should nevertheless work for you either directly or with minor tweaks.

First of all, we mark the Item class with [XmlRoot("item")] so that a single <item> element can be deserialized on its own. The reason will become clear shortly.

[XmlRoot("item")]
public class Item
{
    [XmlAttribute("type")]
    public string Type { get; set; }

    [XmlElement("prop1")]
    public int Prop1 { get; set; }
}

I've also added a simple integer property to prove that the deserialization works as expected.

I also changed the XML content to match the new type, for testing.

<root>
  <item type="b">
    <prop1>5</prop1>
  </item>
  <item type="a">
    <prop1>5</prop1>
  </item>
  <item type="a">
    <prop1>5</prop1>
  </item>
  <item type="b">
    <prop1>5</prop1>
  </item>
  <item type="c">
    <prop1>5</prop1>
  </item>
</root>

Next comes the Root class, which now implements IXmlSerializable explicitly:

[XmlRoot("root")]
public class Root : IXmlSerializable
{
    [XmlElement("item")]
    public Item[] Items { get; set; }

    // Only deserialization is needed here, so WriteXml is left unimplemented.
    // GetSchema is reserved; the IXmlSerializable documentation says to return null from it.
    System.Xml.Schema.XmlSchema IXmlSerializable.GetSchema() { return null; }
    void IXmlSerializable.WriteXml(System.Xml.XmlWriter writer) { throw new NotImplementedException(); }

    void IXmlSerializable.ReadXml(System.Xml.XmlReader reader)
    {
        // On entry the reader is positioned on the <root> start tag.

        // Collects only the items whose type attribute is "a".
        List<Item> typeAItems = new List<Item>();

        // Serializer used to deserialize a single <item> element.
        XmlSerializer deserializer = new XmlSerializer(typeof(Item));

        // Consume <root>; an empty <root/> has no children and no end tag.
        bool isEmptyRoot = reader.IsEmptyElement;
        reader.ReadStartElement();
        if (!isEmptyRoot)
        {
            // Deserialize() and Skip() both leave the reader on the node that
            // follows the element they consumed, so the loop must not call
            // Read() again: doing so would step over the next <item> whenever
            // no whitespace separates the elements.
            while (reader.MoveToContent() == System.Xml.XmlNodeType.Element)
            {
                if (reader.Name == "item" && reader.GetAttribute("type") == "a")
                {
                    // Deserialize the current <item> and all of its children.
                    typeAItems.Add((Item)deserializer.Deserialize(reader));
                }
                else
                {
                    // Skip the element and all of its children
                    // without deserializing them.
                    reader.Skip();
                }
            }
            // Consume </root>.
            reader.ReadEndElement();
        }
        Items = typeAItems.ToArray();
    }
}

The calling code stays the same: you still create new XmlSerializer(typeof(Root)) and call Deserialize() on it.

All that remains is to test it.
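One way to test this end to end is sketched below. It is a minimal console program, not a definitive implementation: it repeats compact versions of Item and Root so that it compiles on its own, drives the loop with MoveToContent() so that it never calls Read() after Deserialize() or Skip(), and feeds in XML with no whitespace between elements on purpose, since that is the layout where a stray Read() would step over a sibling. The Program class, the inline XML string and the prop1 values are assumptions made up for the test.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

[XmlRoot("item")]
public class Item
{
    [XmlAttribute("type")] public string Type { get; set; }
    [XmlElement("prop1")] public int Prop1 { get; set; }
}

[XmlRoot("root")]
public class Root : IXmlSerializable
{
    public Item[] Items { get; set; }

    // GetSchema is reserved; returning null follows the documented guidance.
    XmlSchema IXmlSerializable.GetSchema() => null;

    // Serialization is out of scope for this sketch.
    void IXmlSerializable.WriteXml(XmlWriter writer) => throw new NotImplementedException();

    void IXmlSerializable.ReadXml(XmlReader reader)
    {
        var kept = new List<Item>();
        var itemSerializer = new XmlSerializer(typeof(Item));

        bool isEmpty = reader.IsEmptyElement;
        reader.ReadStartElement();              // consume <root> (or <root/>)
        if (!isEmpty)
        {
            // MoveToContent() skips whitespace and comments and stops on the
            // next element or on the closing </root> tag.
            while (reader.MoveToContent() == XmlNodeType.Element)
            {
                if (reader.LocalName == "item" && reader.GetAttribute("type") == "a")
                    kept.Add((Item)itemSerializer.Deserialize(reader));
                else
                    reader.Skip();              // drop the element and its subtree
            }
            reader.ReadEndElement();            // consume </root>
        }
        Items = kept.ToArray();
    }
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical test input: no whitespace between elements.
        const string xml =
            "<root><item type=\"b\"><prop1>1</prop1></item>" +
            "<item type=\"a\"><prop1>2</prop1></item>" +
            "<item type=\"a\"><prop1>3</prop1></item>" +
            "<item type=\"c\"><prop1>4</prop1></item></root>";

        var serializer = new XmlSerializer(typeof(Root));
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            var root = (Root)serializer.Deserialize(reader);
            foreach (var item in root.Items)
                Console.WriteLine($"{item.Type}: {item.Prop1}");
        }
    }
}
```

With this input the program should print only the two type "a" items, with prop1 values 2 and 3. For the real 100MB payloads, the same check can be repeated under a memory profiler to confirm that no Item objects are created for the skipped elements.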
