How to deal with XML namespaces in XmlDocument in C#

I have several XML documents, all of which have the same structure (element names, attribute names and hierarchy).

However, some of the elements and attributes have custom namespaces in each XML document that are not known at design time. They change, don't ask...

How can I deal with this when traversing the documents using a single set of XPath expressions?

Should I remove all the namespaces before processing?

Can I automatically register all namespaces with an XmlNamespaceManager?

Any thoughts?

Update: some examples (with namespace declarations omitted for clarity):

<root>
    <child attr="val" />
</root>

<root>
    <x:child attr="val" />
</root>

<root>
    <y:child z:attr="val" />
</root>

Thanks

Suppose you have the following XML:

  <root xmlns="first">
   <el1 xmlns="second">
    <el2 xmlns="third">...

You can write your queries to ignore namespaces in the following way: /*[local-name()='root']/*[local-name()='el1']/*[local-name()='el2'] etc. Of course, you can iterate over the whole document to collect the namespaces and load them into a namespace manager, but in the general case this forces you to visit every node in the document. In that case it will be faster to just treat the document as a tree of objects and not use XPath at all.
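As a minimal sketch of the local-name() approach with XmlDocument, using the root/child/attr shape from the question (the class and method names here are made up for illustration):

```csharp
using System.Xml;

public static class LocalNameXPathDemo
{
    // Returns the value of the "attr" attribute on root/child, whatever
    // namespace (if any) the element or attribute is in. Returns null
    // when nothing matches.
    public static string GetChildAttr(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Every step matches on the local name only, so no
        // XmlNamespaceManager is needed.
        XmlNode attr = doc.SelectSingleNode(
            "/*[local-name()='root']/*[local-name()='child']/@*[local-name()='attr']");
        return attr?.Value;
    }
}
```

Calling GetChildAttr on `<root><child attr='val' /></root>` and on a variant such as `<root xmlns:y='urn:y' xmlns:z='urn:z'><y:child z:attr='val' /></root>` should both yield "val". The trade-off is that elements with the same local name in different namespaces can no longer be told apart.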

I believe you'll find some good insight in this Stack Overflow thread:

XPath + Namespace Driving me crazy

In my opinion you have either of two solutions:

1- If the set of all possible namespaces is known beforehand, you can register them all in an XmlNamespaceManager before you begin parsing

2- Use namespace-agnostic XPath selectors
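For the first option, a rough sketch could look like the following. The namespace URIs, the prefixes "a" and "b", and the class name are all placeholders, not values from the question. The prefixes used in the XPath are resolved through the manager, so they do not have to match the prefixes used inside each document; only the URIs matter.

```csharp
using System.Xml;

public static class KnownNamespacesDemo
{
    // Finds root/child whether child is in no namespace or in one of the
    // namespaces registered below. The URIs are hypothetical examples.
    public static XmlNode FindChild(XmlDocument doc)
    {
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("a", "urn:example:first");   // assumed known URI
        ns.AddNamespace("b", "urn:example:second");  // assumed known URI

        // The union covers the un-namespaced form plus each known namespace.
        return doc.SelectSingleNode(
            "/root/child | /root/a:child | /root/b:child", ns);
    }
}
```

The union grows with every namespace you have to support, which is why the second option tends to be easier when the namespaces really are unpredictable.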

Of course, you can always scrub the XML document of any inline namespaces and start your parsing on a clean, uniform XML document without namespaces, but honestly I don't see the gain in adding this overhead step.

Scott Hanselman has a nice article about extracting all of the XML Namespaces in an XML document. Presumably, when you get all of the XML Namespaces, you can just iterate over all of them and register them in your namespace manager.
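One possible way to do that with XmlDocument is sketched below. This is not necessarily the approach from the article; the class name and the "def" prefix chosen for default namespaces are assumptions made for illustration.

```csharp
using System.Xml;

public static class NamespaceCollector
{
    // Walks every element, reads its xmlns declarations and registers
    // them in a new XmlNamespaceManager.
    public static XmlNamespaceManager BuildManager(XmlDocument doc)
    {
        var ns = new XmlNamespaceManager(doc.NameTable);
        foreach (XmlNode node in doc.SelectNodes("//*"))
        {
            foreach (XmlAttribute attr in node.Attributes)
            {
                if (attr.Value.Length == 0)
                    continue; // skip xmlns="" (namespace reset)

                if (attr.Prefix == "xmlns")
                    ns.AddNamespace(attr.LocalName, attr.Value); // xmlns:x="..."
                else if (attr.Name == "xmlns")
                    ns.AddNamespace("def", attr.Value); // default namespace, hypothetical prefix
            }
        }
        return ns;
    }
}
```

If the same prefix is bound to different URIs in different parts of a document, the last declaration visited wins in this sketch. Note also that the XPath expressions still have to use the prefixes found in each document, so this on its own does not give you a single set of expressions across documents with different prefixes.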

You could try something like this to strip the namespaces:

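using System.Linq;
using System.Xml.Linq;

// Public entry point: parses the XML string and returns it with all namespaces removed
public string RemoveAllNamespaces(string xmlDocument)
{
    return RemoveAllNamespaces(XElement.Parse(xmlDocument)).ToString();
}

// Core recursive function: rebuilds each element using only its local name
private XElement RemoveAllNamespaces(XElement element)
{
    // Keep attributes (by local name) but drop the xmlns declarations themselves.
    // Note: two attributes with the same local name in different namespaces would collide here.
    var attributes = element.Attributes()
        .Where(a => !a.IsNamespaceDeclaration)
        .Select(a => new XAttribute(a.Name.LocalName, a.Value));

    if (!element.HasElements)
    {
        // Leaf element: copy the text value
        XElement leaf = new XElement(element.Name.LocalName, attributes);
        leaf.Value = element.Value;
        return leaf;
    }
    return new XElement(element.Name.LocalName, attributes,
        element.Elements().Select(el => RemoveAllNamespaces(el)));
}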

See Peter Stegnar's answer here for more details:
How to remove all namespaces from XML with C#?

You can also use direct node tests with wildcards, which will match any namespace (or lack of one):

$your-document/*:root/*:child/@*:attr
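The *:name form is XPath 2.0 syntax, which the XPath 1.0 engine behind XmlDocument.SelectNodes does not accept. A rough C# equivalent, swapping XPath for LINQ to XML and matching on local names only, might look like this (the class and method names are illustrative, and the root/child/attr shape is taken from the question):

```csharp
using System.Linq;
using System.Xml.Linq;

public static class LocalNameLinqDemo
{
    // Mirrors /*:root/*:child/@*:attr by comparing local names only.
    // Returns null when nothing matches.
    public static string GetChildAttr(string xml)
    {
        XElement root = XDocument.Parse(xml).Root;
        if (root == null || root.Name.LocalName != "root")
            return null;

        return root.Elements()
            .Where(e => e.Name.LocalName == "child")
            .SelectMany(e => e.Attributes())
            // Exclude xmlns declarations, which LINQ to XML exposes as attributes.
            .Where(a => !a.IsNamespaceDeclaration && a.Name.LocalName == "attr")
            .Select(a => a.Value)
            .FirstOrDefault();
    }
}
```

This walks the document as a tree of objects, so it involves neither an XPath expression nor a namespace manager.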
