
How to specify namespace when querying nodes with XPath?

Short Version

  • You do it in .NET with:

     XmlNode.SelectNodes(query, selectionNamespaces); 
  • Can you do it in javascript?

  • Can you do it in msxml?

Attempt A :

IXMLDOMNode.selectNodes(query); //no namespaces option

Attempt B :

IXMLDOMNode.ownerDocument.setProperty("SelectionNamespaces", selectionNamespaces);
IXMLDOMNode.selectNodes(query); //doesn't work

Attempt C :

IXMLDOMDocument3 doc;
doc.setProperty("SelectionNamespaces", selectionNamespaces);
IXMLDOMNodeList list = doc.selectNodes(...)[0].selectNodes(query); //doesn't work

Long Version

Given an IXMLDOMNode containing a fragment of xml:

<row>
    <cell>a</cell>
    <cell>b</cell>
    <cell>c</cell>
</row>

We can use the IXMLDOMNode.selectNodes method to select child elements:

IXMLDOMNode row = //...xml above

IXMLDOMNodeList cells = row.selectNodes("/row/cell");

and that will return an IXMLDOMNodeList :

  • <cell>a</cell>
  • <cell>b</cell>
  • <cell>c</cell>

And that's fine.

But namespaces break it

If the XML fragment originated from a document with a namespace, eg:

<row xmlns:ss="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
    <cell>a</cell>
    <cell>b</cell>
    <cell>c</cell>
</row>

The same XPath query will return nothing, because the elements row and cell do not exist; they are in another namespace.

Querying documents with default namespace

If you had a full IXMLDOMDocument , you would use the setProperty method to set a selection namespace :

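<row xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
    <cell>a</cell>
    <cell>b</cell>
    <cell>c</cell>
</row>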

You would query the default namespace by giving it a name, eg:

  • Before : xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
  • After : xmlns:peanut="http://schemas.openxmlformats.org/spreadsheetml/2006/main"

and then you can query it:

IXMLDOMDocument3 doc = //...document xml above
doc.setProperty("SelectionNamespaces", "xmlns:peanut='http://schemas.openxmlformats.org/spreadsheetml/2006/main'");

IXMLDOMNodeList cells = doc.selectNodes("/peanut:row/peanut:cell");

and you get your cells:

  • <cell>a</cell>
  • <cell>b</cell>
  • <cell>c</cell>
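
The point that the prefix is just a local alias for the namespace URI can be checked outside MSXML. The following is a minimal sketch using Python's standard xml.etree.ElementTree rather than MSXML (an assumption made purely for illustration; ElementTree takes the prefix map per call instead of through a SelectionNamespaces property). The variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# A document whose elements live in a default (unprefixed) namespace.
xml = f'<row xmlns="{NS_URI}"><cell>a</cell><cell>b</cell><cell>c</cell></row>'
row = ET.fromstring(xml)

# An unprefixed name in the query means "no namespace", so nothing matches.
unprefixed = row.findall("cell")

# Any prefix works, as long as it is mapped to the same namespace URI.
peanut = row.findall("peanut:cell", {"peanut": NS_URI})
other = row.findall("x:cell", {"x": NS_URI})

print(len(unprefixed), len(peanut), len(other))
```

The prefix used in the query does not have to match anything written in the document; only the URI has to match.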

But that doesn't work for a node

An IXMLDOMNode has a method to perform XPath queries:

selectNodes Method

Applies the specified pattern-matching operation to this node's context and returns the list of matching nodes as IXMLDOMNodeList .

 HRESULT selectNodes( BSTR expression, IXMLDOMNodeList **resultList); 

Remarks

For more information about using the selectNodes method with namespaces, see the setProperty Method topic.

But there's no way to specify Selection Namespaces when issuing an XPath query against a DOM Node.

How can I specify a namespace when querying nodes with XPath?

.NET Solution

.NET's XmlNode provides a SelectNodes method that accepts an XmlNamespaceManager parameter:

XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
ns.AddNamespace("peanut", "http://schemas.openxmlformats.org/spreadsheetml/2006/main");
cells = row.SelectNodes("/peanut:row/peanut:cell", ns);

But I'm not in C# (nor am I in JavaScript). What's the native msxml6 equivalent?

Edit : Me not so much with the Javascript ( jsFiddle )

Complete Minimal Example

program Project3;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils, msxml, ActiveX;

procedure Main;
var
    s: string;
    doc: DOMDocument60;
    rows: IXMLDOMNodeList;
    row: IXMLDOMElement;
    cells: IXMLDOMNodeList;
begin
    s :=
            '<?xml version="1.0" encoding="UTF-16" standalone="yes"?>'+#13#10+
            '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'+#13#10+
            '<row>'+#13#10+
            '    <cell>a</cell>'+#13#10+
            '    <cell>b</cell>'+#13#10+
            '    <cell>c</cell>'+#13#10+
            '</row>'+#13#10+
            '</worksheet>';

    doc := CoDOMDocument60.Create;
    doc.loadXML(s);
    if doc.parseError.errorCode <> 0 then
        raise Exception.CreateFmt('Parse error: %s', [doc.parseError.reason]);

    doc.setProperty('SelectionNamespaces', 'xmlns:ss="http://schemas.openxmlformats.org/spreadsheetml/2006/main"');

    //Query for all the rows
    rows := doc.selectNodes('/ss:worksheet/ss:row');
    if rows.length = 0 then
        raise Exception.Create('Could not find any rows');

    //Do stuff with the first row
    row := rows[0] as IXMLDOMElement;

    //Get the cells in the row
    (row.ownerDocument as IXMLDOMDocument3).setProperty('SelectionNamespaces', 'xmlns:ss="http://schemas.openxmlformats.org/spreadsheetml/2006/main"');
    cells := row.selectNodes('/ss:row/ss:cell');
    if cells.length <> 3 then
        raise Exception.CreateFmt('Did not find 3 cells in the first row (%d)', [cells.length]);
end;

begin
  try
        CoInitialize(nil);
        Main;
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.

This is answered on MSDN:

How To Specify Namespace when Querying the DOM with XPath

Update:

Note, however, that in your second example XML, the <row> and <cell> elements are NOT in the namespace being queried by the XPath when adding xmlns:peanut to the SelectionNamespaces property. That is why the <cell> elements are not being found.

To put them into the namespace properly, you would have to either:

  • change the namespace declaration to use xmlns= instead of xmlns:ss= :

     <row xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
         <cell>a</cell>
         <cell>b</cell>
         <cell>c</cell>
     </row>
  • use <ss:row> and <ss:cell> instead of <row> and <cell> :

     <ss:row xmlns:ss="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
         <ss:cell>a</ss:cell>
         <ss:cell>b</ss:cell>
         <ss:cell>c</ss:cell>
     </ss:row>

The SelectionNamespaces property does not magically put elements into a namespace for you; it only specifies which namespace prefixes are available for the XPath query to use. The XML itself has to put elements into the proper namespaces as needed.
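
To see which namespace each form of the XML actually puts the elements in, here is a small sketch. It uses Python's standard xml.etree.ElementTree instead of MSXML (an assumption for illustration only), because ElementTree exposes the resolved namespace directly in each element's tag as {uri}name. The variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# Prefix declared but never used: the elements stay in no namespace.
declared_only = ET.fromstring(f'<row xmlns:ss="{NS_URI}"><cell>a</cell></row>')

# Default namespace: unprefixed elements are in the namespace.
default_ns = ET.fromstring(f'<row xmlns="{NS_URI}"><cell>a</cell></row>')

# Prefixed elements: also in the namespace.
prefixed = ET.fromstring(
    f'<ss:row xmlns:ss="{NS_URI}"><ss:cell>a</ss:cell></ss:row>'
)

for label, root in [
    ("declared only", declared_only),
    ("default namespace", default_ns),
    ("prefixed", prefixed),
]:
    # root[0] is the first child element (the cell).
    print(label, "->", root.tag, root[0].tag)
```

Only the last two documents contain elements that a namespace-qualified query such as ss:cell can match.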

Update:

In your new example, cells := row.selectNodes('/ss:row/ss:cell'); does not work because the XPath query is using an absolute path, where the leading / starts at the document root, and there are no <row> elements at the top of the XML document, only a <worksheet> element. That is why rows := doc.selectNodes('/ss:worksheet/ss:row'); works.

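If you want to perform an XPath query that begins at the node being queried, don't use an absolute path; use a relative path instead. The context node is already the <row> element, so the path should not repeat ss:row (a relative ss:row/ss:cell would look for a <row> child of the row):

cells := row.selectNodes('./ss:cell');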

Or simply:

cells := row.selectNodes('ss:cell');
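
The same two-step pattern (select the rows from the document, then query each row with a path relative to that row) can be sketched outside MSXML. This example uses Python's standard xml.etree.ElementTree as a stand-in (an assumption for illustration; it is not the MSXML API, and it passes the prefix map on every call). The variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"ss": NS_URI}  # prefix map, playing the role of SelectionNamespaces

worksheet = ET.fromstring(
    f'<worksheet xmlns="{NS_URI}">'
    "<row><cell>a</cell><cell>b</cell><cell>c</cell></row>"
    "</worksheet>"
)

# Step 1: find the first row, starting from the root element.
row = worksheet.find("ss:row", NS)

# Step 2: query relative to the row; both spellings start at the row itself.
cells = row.findall("ss:cell", NS)
dot = row.findall("./ss:cell", NS)

print([c.text for c in cells], len(dot))
```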
