
Using XmlReader class to parse XML with elements of the same name

I'm rewriting some code that uses an XmlDocument to parse some XML. I want to use an XmlReader instead to see whether I can get some performance improvement. The structure of the XML looks like this:

<items>
   <item id="1" desc="one">
      <itemBody date="2012-11-12" />
   </item>
   <item id="2" desc="two">
      <itemBody date="2012-11-13" />
   </item>
   <item id="3" desc="three">
      <itemBody date="2012-11-14" />
   </item>
   <item id="4" desc="four">
      <itemBody date="2012-11-15" />
   </item>
</items>

Basically, I need to iterate through all the <item> elements. Like I said, the old code works like this:

XmlDocument document = new XmlDocument();

// load XML into XmlDocument
document.LoadXml(xml);

// use xpath to split into individual item
string xPath = @"items/item";
XmlNodeList nodeList = document.SelectNodes(xPath);

// loop through each item
for (int nodeIndex = 0; nodeIndex < nodeList.Count; nodeIndex++)
{
    // do something with the XmlNode
    XmlNode node = nodeList[nodeIndex];
}

This works fine, but I think using an XmlReader would be faster. So I've written this:

XmlReader xmlReader = XmlReader.Create(new StringReader(xml));

while (xmlReader.Read())
{                       
   if (xmlReader.Name.Equals("item") && (xmlReader.NodeType == XmlNodeType.Element))
   {
      string id = xmlReader.GetAttribute("id");                 
      string desc = xmlReader.GetAttribute("desc");
      string elementXml = xmlReader.ReadOuterXml();
   }
}

However, this code only reads the first <item> element; the call to ReadOuterXml() seems to break the loop. Does anybody know how to get around this, or is this type of parsing not possible with an XmlReader? I have to do this on .NET 2.0 :( so I can't use LINQ.

I just tested your code in LINQPad and it works well.

var xml = @"<items>
   <item id='1' desc='one' />
   <item id='2' desc='two' />
   <item id='3' desc='three' />
   <item id='4' desc='four' />
</items>";
XmlReader xmlReader = XmlReader.Create(new StringReader(xml));

while (xmlReader.Read())
{   
   if (xmlReader.Name.Equals("item") && (xmlReader.NodeType == XmlNodeType.Element))
   {
      string id = xmlReader.GetAttribute("id");              
      string desc = xmlReader.GetAttribute("desc");
      Console.WriteLine("{0} {1}", id, desc);
   }
}

Output:

1 one
2 two
3 three
4 four

Maybe there is something wrong with your XML.
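One possible explanation for the different results, offered as an assumption rather than something the question confirms: ReadOuterXml() leaves the reader on the node that follows the closing </item> tag, and the Read() at the top of the loop then advances once more. With indented XML the node that gets stepped over is only whitespace, but with XML that has no whitespace between elements it is the next <item>. The sketch below runs the question's loop over both layouts and collects the ids it finds. The class, method, and constant names are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical demo class; the names are illustrative only.
public static class ReadOuterXmlSkipDemo
{
    // Indented layout: a whitespace node sits between consecutive <item> elements.
    public const string Indented =
        "<items>\n" +
        "  <item id='1'><itemBody /></item>\n" +
        "  <item id='2'><itemBody /></item>\n" +
        "  <item id='3'><itemBody /></item>\n" +
        "  <item id='4'><itemBody /></item>\n" +
        "</items>";

    // Compact layout: each </item> is followed directly by the next <item>.
    public const string Compact =
        "<items>" +
        "<item id='1'><itemBody /></item>" +
        "<item id='2'><itemBody /></item>" +
        "<item id='3'><itemBody /></item>" +
        "<item id='4'><itemBody /></item>" +
        "</items>";

    // Same loop shape as in the question: Read() at the top, ReadOuterXml() inside.
    public static List<string> IdsFoundByQuestionLoop(string xml)
    {
        List<string> ids = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "item")
                {
                    ids.Add(reader.GetAttribute("id"));
                    // Consumes the whole <item> and moves to the node after </item>;
                    // the next Read() then steps past that node without inspecting it.
                    reader.ReadOuterXml();
                }
            }
        }
        return ids;
    }
}
```

Calling IdsFoundByQuestionLoop(ReadOuterXmlSkipDemo.Indented) should find all four ids, while the Compact string should yield only 1 and 3, because every second <item> is stepped over.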

The following seems to work:

StringBuilder xml = new StringBuilder();

xml.Append("<items>");
xml.Append("<item id=\"1\" desc=\"one\">");
xml.Append("<itembody id=\"10\"/>");
xml.Append("</item>");
xml.Append("<item id=\"2\" desc=\"two\">");
xml.Append("<itembody id=\"20\"/>");
xml.Append("</item>");
xml.Append("<item id=\"3\" desc=\"three\">");
xml.Append("<itembody id=\"30\"/>");
xml.Append("</item>");
xml.Append("</items>");

using (XmlTextReader tr = new XmlTextReader(new StringReader(xml.ToString())))
{
    bool canRead = tr.Read();
    while (canRead)
    {
        if ((tr.Name == "item") && tr.IsStartElement())
        {
            Console.WriteLine(tr.GetAttribute("id"));
            Console.WriteLine(tr.GetAttribute("desc"));

            // ReadOuterXml() already moves past </item>, so Read() is not called on this branch
            string outerxml = tr.ReadOuterXml();
            Console.WriteLine(outerxml);

            canRead = (outerxml != string.Empty);
        }
        else
        {
            // any other node: advance to the next one
            canRead = tr.Read();
        }
    }
}
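The loop above works because it never calls Read() straight after ReadOuterXml(): ReadOuterXml() has already moved the reader onto whatever follows </item>, so that node has to be inspected before advancing again. A minimal sketch of the same idea, written against XmlReader.Create and the reader's EOF property, is shown below. ItemRecord, ItemScanner, and ReadItems are hypothetical names, and the code avoids LINQ and newer language features on the assumption that it has to build for .NET 2.0.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical container for one <item>; not part of the original answer.
public class ItemRecord
{
    public string Id;
    public string Desc;
    public string OuterXml;
}

// Hypothetical helper class; the names are illustrative only.
public static class ItemScanner
{
    public static List<ItemRecord> ReadItems(string xml)
    {
        List<ItemRecord> items = new List<ItemRecord>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "item")
                {
                    ItemRecord record = new ItemRecord();
                    record.Id = reader.GetAttribute("id");
                    record.Desc = reader.GetAttribute("desc");
                    // Reads the whole element and leaves the reader on the node
                    // after </item>, so no Read() call is made on this branch.
                    record.OuterXml = reader.ReadOuterXml();
                    items.Add(record);
                }
                else
                {
                    // Any other node (root element, whitespace, end tags): step forward.
                    reader.Read();
                }
            }
        }
        return items;
    }
}
```

Passing the XML from the question to ItemScanner.ReadItems should return one record per <item>, whether or not there is whitespace between the elements.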

If you can use LINQ to XML (System.Xml.Linq, available from .NET 3.5), here is an alternative way:

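using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

class Program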
{
    static void Main(string[] args)
    {

        const string xml = @"<items>
                          <item id='1' desc='one'>
                            <itemBody date='2012-11-12' />
                          </item>
                          <item id='2' desc='two'>
                            <itemBody date='2012-11-13' />
                          </item>
                          <item id='3' desc='three'>
                            <itemBody date='2012-11-14' />
                          </item>
                          <item id='4' desc='four'>
                            <itemBody date='2012-11-15' />
                          </item>
                        </items>";

        var xmlReader = XmlReader.Create(new StringReader(xml));

        XElement element = XElement.Load(xmlReader, LoadOptions.SetBaseUri);

        IEnumerable<XElement> items = element.DescendantsAndSelf("item");

        foreach (var xElement in items)
        {
            string id = GetAttributeValue("id", xElement);
            string desc = GetAttributeValue("desc", xElement);
            string itemBody = GetElementValue("itemBody", "date", xElement);

            Console.WriteLine("id = {0}, desc = {1}, date = {2}", id, desc, itemBody);
        }

        Console.ReadLine();
    }

    private static string GetElementValue(string elementName, string attributeName, XElement element)
    {
        XElement xElement = element.Element(elementName);

        string value = string.Empty;

        if (xElement != null)
        {
            XAttribute xAttribute = xElement.Attribute(attributeName);

            if (xAttribute != null)
            {
                value = xAttribute.Value;
            }
        }

        return value;
    }

    private static string GetAttributeValue(string attributeName, XElement element)
    {
        XAttribute xAttribute = element.Attribute(attributeName);

        string value = string.Empty;
        if (xAttribute != null)
        {
            value = xAttribute.Value;
        }

        return value;
    }
}


 