
Using xpath and vtd-xml to get sub nodes and text of an element as a string

This is a portion of my XML:

<MAIN>
    <L>
        <D>string1 string2 <b>string3</b> string4</D>
    </L>
    <L>
        <D>string5 string6 <b>string7</b> string8 <i>string9</i></D>
    </L>
</MAIN>

I want to get the content of every <D> tag as a string, inner markup included. So, the example above should return:

1st iteration: 'string1 string2 <b>string3</b> string4'
2nd iteration: 'string5 string6 <b>string7</b> string8 <i>string9</i>'
etc...

In vtd-xml I used an AutoPilot with the XPath expressions "//L/D" and "//L/D/text()", but neither gave me that result.

Any advice or alternative approach will be appreciated.

Regards

Below is the code that does what you are looking for.

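    import com.ximpleware.*;

    public class ContentFragments {
        public static void main(String[] args) throws Exception {
            VTDGen vg = new VTDGen();
            // parseFile returns false if parsing fails; true = namespace-aware
            if (vg.parseFile("c://xml//alex.txt", true)) {
                VTDNav vn = vg.getNav();
                AutoPilot ap = new AutoPilot(vn);
                ap.selectXPath("//L/D");
                while (ap.evalXPath() != -1) {
                    // lower 32 bits: offset of the content; upper 32 bits: its length
                    long l = vn.getContentFragment();
                    if (l == -1) continue; // empty element such as <D/>
                    System.out.println(" -==> " + vn.toString((int) l, (int) (l >> 32)));
                }
            }
        }
    }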
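The two casts in the print call split one long into an offset and a length. A minimal sketch of that bit layout, in plain Java with no vtd-xml dependency, is given here. The class FragmentBits and its methods are hypothetical names made up for illustration; only the layout (offset in the lower 32 bits, length in the upper 32 bits) is taken from the casts in the code above.

```java
// Hypothetical helper, not part of vtd-xml: it only mirrors the bit layout
// implied by the casts (int) l and (int) (l >> 32) in the answer.
class FragmentBits {
    // Pack an offset (lower 32 bits) and a length (upper 32 bits) into one long.
    static long pack(int offset, int length) {
        return ((long) length << 32) | (offset & 0xFFFFFFFFL);
    }

    // Casting to int keeps only the lower 32 bits, i.e. the offset.
    static int offset(long l) {
        return (int) l;
    }

    // Shifting right by 32 moves the upper half down before the cast, i.e. the length.
    static int length(long l) {
        return (int) (l >> 32);
    }

    public static void main(String[] args) {
        long l = pack(27, 38);
        System.out.println(offset(l) + " " + length(l)); // prints "27 38"
    }
}
```

Returning both values in a single long avoids allocating a small object for every matched element.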

Use:

/*/L/D/node()

This selects all nodes (elements, text-nodes, processing-instructions and comment-nodes) that are children of any D element that is a child of any L element that is a child of the top element of the XML document.
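Note that this expression returns one flat node-set covering every D element, so the grouping per D is lost. As a rough sketch of one way to get a string per D, the code below uses the JDK's own XPath and Transformer classes rather than vtd-xml: it selects each D first, then evaluates node() relative to it and serializes the children. The class name InnerXml and the method innerXmlOfD are hypothetical, and a namespace-free document like the one in the question is assumed.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; uses only JDK classes.
class InnerXml {
    // Returns one string per D element: its child nodes serialized in document order.
    static List<String> innerXmlOfD(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList ds = (NodeList) xpath.evaluate(
                    "/*/L/D", new InputSource(new StringReader(xml)), XPathConstants.NODESET);

            // Identity transformer, used here only as a serializer.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            List<String> result = new ArrayList<>();
            for (int i = 0; i < ds.getLength(); i++) {
                // node() evaluated relative to one D keeps its children grouped together.
                NodeList children = (NodeList) xpath.evaluate(
                        "node()", ds.item(i), XPathConstants.NODESET);
                StringWriter out = new StringWriter();
                for (int j = 0; j < children.getLength(); j++) {
                    serializer.transform(new DOMSource(children.item(j)), new StreamResult(out));
                }
                result.add(out.toString());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation or serialization failed", e);
        }
    }
}
```

Unlike a raw byte fragment, this re-serializes the parsed nodes, so details such as entity escaping or attribute quoting may differ from the source text.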

Alternatively, you could select the child nodes of each of the two /*/L/D elements separately:

/*/L[1]/D/node()

and

/*/L[2]/D/node()

Verification, using XSLT as the host of XPath:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:copy-of select="/*/L[1]/D/node()"/>
--------------------
  <xsl:copy-of select="/*/L[2]/D/node()"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<MAIN>
    <L>
        <D>string1 string2 
            <b>string3</b> string4
        </D>
    </L>
    <L>
        <D>string5 string6 
            <b>string7</b> string8 
            <i>string9</i>
        </D>
    </L>
</MAIN>

the wanted, correct result is produced:

string1 string2 
            <b>string3</b> string4

--------------------
  string5 string6 
            <b>string7</b> string8 
            <i>string9</i>
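
To reproduce this verification from Java, a stylesheet like the one above can be run with the XSLT 1.0 processor bundled in the JDK. This is only a sketch: the class RunStylesheet and its transform method are hypothetical names, and both the stylesheet and the XML document are assumed to be passed in as strings.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for illustration; uses only JDK classes.
class RunStylesheet {
    // Applies the given XSLT stylesheet to the given XML and returns the output as a string.
    static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

Exact whitespace in the output depends on the processor and on the indent setting, so compare results by content rather than character by character.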
