简体   繁体   中英

Get nodes from xml files

How to parse the xml file?

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"> 
<sitemap> 
    <loc>link</loc>
    <lastmod>2011-08-17T08:23:17+00:00</lastmod> 
</sitemap> 
<sitemap>
    <loc>link</loc> 
    <lastmod>2011-08-18T08:23:17+00:00</lastmod> 
</sitemap> 
</sitemapindex>

I am new to XML, I tried this, but it seems to be not working :

        XmlDocument xml = new XmlDocument(); //* create an xml document object. 
        xml.Load("sitemap.xml");
        XmlNodeList xnList = xml.SelectNodes("/sitemapindex/sitemap");
        foreach (XmlNode xn in xnList)
        {
            String loc= xn["loc"].InnerText;
            String lastmod= xn["lastmod"].InnerText;
        }

The problem is that the sitemapindex element defines a default namespace. You need to specify the namespace when you select the nodes, otherwise it will not find them. For instance:

XmlDocument xml = new XmlDocument();
xml.Load("sitemap.xml");
XmlNamespaceManager manager = new XmlNamespaceManager(xml.NameTable);
manager.AddNamespace("s", "http://www.sitemaps.org/schemas/sitemap/0.9");
XmlNodeList xnList = xml.SelectNodes("/s:sitemapindex/s:sitemap", manager);

Normally speaking, when using the XmlNameSpaceManager , you could leave the prefix as an empty string to specify that you want that namespace to be the default namespace. So you would think you'd be able to do something like this:

// WON'T WORK
XmlDocument xml = new XmlDocument();
xml.Load("sitemap.xml");
XmlNamespaceManager manager = new XmlNamespaceManager(xml.NameTable);
manager.AddNamespace("", "http://www.sitemaps.org/schemas/sitemap/0.9"); //Empty prefix
XmlNodeList xnList = xml.SelectNodes("/sitemapindex/sitemap", manager); //No prefixes in XPath

However, if you try that code, you'll find that it won't find any matching nodes. The reason for this is that in XPath 1.0 (which is what XmlDocument implements), when no namespace is provided, it always uses the null namespace, not the default namespace. So, it doesn't matter if you specify a default namespace in the XmlNamespaceManager , it's not going to be used by XPath, anyway. To quote the relevant paragraph from the Official XPath Specification :

A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. This is the same way expansion is done for element type names in start and end-tags except that the default namespace declared with xmlns is not used: if the QName does not have a prefix, then the namespace URI is null (this is the same way attribute names are expanded). It is an error if the QName has a prefix for which there is no namespace declaration in the expression context.

Therefore, when the elements you are reading belong to a namespace, you can't avoid putting the namespace prefix in your XPath statements. However, if you don't want to bother putting the namespace URI in your code, you can just use the XmlDocument object to return the URI of the root element, which in this case, is what you want. For instance:

XmlDocument xml = new XmlDocument();
xml.Load("sitemap.xml");
XmlNamespaceManager manager = new XmlNamespaceManager(xml.NameTable);
manager.AddNamespace("s", xml.DocumentElement.NamespaceURI); //Using xml's properties instead of hard-coded URI
XmlNodeList xnList = xml.SelectNodes("/s:sitemapindex/s:sitemap", manager);

Sitemap has 2 sub nodes "loc" and "lastmod". The nodes that you are accessing are "name" and "url". that is why you are not getting any result. Also in your XML file the last sitemap tag is not closed properly with a corresponding Kindly try xn["loc"].InnerText and see if you get the desired result.

I would definitely use LINQ to XML instead of the older XmlDocument based XML API. You can accomplish what you are looking to do using the following code. Notice, I changed the name of the element that I am trying to get the value of to 'loc' and 'lastmod', because this is what is in your sample XML ('name' and 'url' did not exist):

XElement element = XElement.Parse(XMLFILE);
        IEnumerable<XElement> list = element.Elements("sitemap");
        foreach (XElement e in list)
        {
            String LOC= e.Element("loc").Value;
            String LASTMOD = e.Element("lastmod").Value;
        }

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM