XML namespaces and XPath

I have an application that has to load an XML document and output nodes depending on an XPath expression.

Suppose I start with a document like this:

<aaa>
  ...[many nodes here]...
  <bbb>text</bbb>
  ...[many nodes here]...
  <bbb>text</bbb>
  ...[many nodes here]...
</aaa>

With the XPath //bbb, everything is fine so far.

The selection doc.SelectNodes("//bbb"); returns the list of required nodes.

Then someone uploads a document with one node like <myfancynamespace:foo/> and an extra namespace in the root tag, and everything breaks.

Why? //bbb does not give a damn about myfancynamespace; theoretically it should even be fine with //myfancynamespace:foo, as there is no ambiguity, but the expression returns 0 results and that's it.

Is there a workaround for this behavior?

I do have a namespace manager for the document, and I am passing it to the XPath query. But the namespaces and the prefixes are unknown to me, so I can't add them before the query.

Do I have to pre-parse the document to fill the namespace manager before I do any selections? Why on earth does it behave like this? It just doesn't make sense.

EDIT:

I'm using: XmlDocument and XmlNamespaceManager

EDIT2:

XmlDocument doc = new XmlDocument();
doc.XmlResolver = null;
XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
//I wish I could:
//nsmgr.AddNamespace("magic", "http://magicnamespaceuri/");
//...
doc.LoadXml(usersuppliedxml);
XmlNodeList nodes = doc.SelectNodes(usersuppliedxpath, nsmgr); // usersuppliedxpath -> "//bbb"

// nodes.Count should be > 0, but with a namespaced document it is 0

EDIT3: Found an article which describes the actual scenario of the issue with one workaround, though not a very pretty one: http://codeclimber.net.nz/archive/2008/01/09/How-to-query-a-XPath-doc-that-has-a-default.aspx

It almost seems that stripping the xmlns is the way to go...

You're missing the whole point of XML namespaces.

But if you really need to perform XPath on documents that will use an unknown namespace, and you really don't care about it, you will need to strip it out and reload the document. XPath will not work in a namespace-agnostic way, unless you want to use the local-name() function at every point in your selectors.

private XmlDocument StripNamespace(XmlDocument doc)
{
    if (doc.DocumentElement.NamespaceURI.Length > 0)
    {
        doc.DocumentElement.SetAttribute("xmlns", "");
        // must serialize and reload for this to take effect
        XmlDocument newDoc = new XmlDocument();
        newDoc.LoadXml(doc.OuterXml);
        return newDoc;
    }
    else
    {
        return doc;
    }
}
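
For the case raised in the question, where the namespaces are not known up front, another option (not something proposed above, only a sketch) might be to read the declarations in scope on the root element and register them with the XmlNamespaceManager before querying. The class name NamespaceManagerSketch and the prefix "d" are hypothetical; the prefix is needed because XPath 1.0 always treats unprefixed names as belonging to no namespace, so a default namespace has to be given some prefix of its own.

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

public static class NamespaceManagerSketch
{
    // Hypothetical prefix chosen for the document's default namespace.
    // Assumption: the document does not already declare a prefix "d".
    public const string DefaultPrefix = "d";

    // Registers every namespace declared in scope on the root element.
    public static XmlNamespaceManager BuildFromRoot(XmlDocument doc)
    {
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        XPathNavigator nav = doc.DocumentElement.CreateNavigator();

        // Keys are prefixes ("" for the default namespace), values are namespace URIs.
        IDictionary<string, string> inScope =
            nav.GetNamespacesInScope(XmlNamespaceScope.ExcludeXml);

        foreach (KeyValuePair<string, string> pair in inScope)
        {
            string prefix = pair.Key.Length == 0 ? DefaultPrefix : pair.Key;
            nsmgr.AddNamespace(prefix, pair.Value);
        }
        return nsmgr;
    }
}
```

With this in place, a user-supplied //bbb would still have to be written as //d:bbb for a document with a default namespace, so the sketch only helps when the queries themselves can be adapted. Namespaces declared further down the tree are not picked up.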

<myfancynamespace:foo/> is not necessarily the same as <foo/>.

Namespaces do matter. But I can understand your frustration, as they tend to break code because various implementations (C#, Java, ...) tend to output them differently.

I suggest you change your XPath to accept all namespaces. For example, instead of

//bbb 

define it as

//*[local-name()='bbb']

That should take care of it.
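
As a rough illustration with the XmlDocument API from the question, the local-name() form can be wrapped in a small helper. The names LocalNameQuery and SelectByLocalName are made up for this sketch, and the helper assumes the element name contains no quote characters, since it is pasted straight into the XPath string.

```csharp
using System.Xml;

public static class LocalNameQuery
{
    // Selects all elements with the given local name, whatever their namespace.
    // Assumption: localName is a plain element name without quote characters.
    public static XmlNodeList SelectByLocalName(XmlDocument doc, string localName)
    {
        return doc.SelectNodes("//*[local-name()='" + localName + "']");
    }
}
```

No namespace manager is needed here, because the expression never names a namespace. The price is that elements called bbb in unrelated namespaces are matched too.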

You should describe in a bit more detail what you want to do. The way you ask your question, it makes no sense at all. The namespace is just a part of the name. Nothing more, nothing less. So your question is the same as asking for an XPath query to get all tags ending with "x". That's not the idea behind XML, but if you have strange reasons to do so, feel free to iterate over all nodes and implement it yourself. The same applies to the functionality you are requesting.

You could use the LINQ to XML classes like XDocument. They greatly simplify working with namespaces.
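
A minimal sketch of what that could look like, assuming the goal is still to find bbb elements in a document whose namespace is not known in advance. The class and method names are invented for illustration; the first method ignores namespaces entirely, the second reuses whatever namespace the root element is in.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class LinqToXmlSketch
{
    // Matches on the local name only, ignoring namespaces.
    public static List<XElement> ByLocalName(string xml, string localName)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants()
                  .Where(e => e.Name.LocalName == localName)
                  .ToList();
    }

    // Matches elements that share the root element's namespace.
    public static List<XElement> InRootNamespace(string xml, string localName)
    {
        XDocument doc = XDocument.Parse(xml);
        XNamespace ns = doc.Root.Name.Namespace;
        return doc.Descendants(ns + localName).ToList();
    }
}
```

The second variant stays namespace-aware without hard-coding a URI, which is closer to how namespaces are meant to be used.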
