How to extract specific tag content in XML file if there are other tags with the same name inside another tag in Java?

Question

I'm currently parsing XML files in Java using DOM, but I've run into a problem: how do I extract the content of a specific tag when other tags with the same name appear inside another tag, as in the following scenario:

<file>
    <sub-file>
        <a> ....</a>
        <b> ....</b>
        <c> ....</c>
    </sub-file>

    <a> ..... some data here ....</a>
    <b> ..... some data here ....</b>
    <c> ..... some data here ....</c>

    <image>
        <a> ....</a>
        <b> ....</b>
        <c> ....</c>
    </image>
</file>

So how can I extract the a, b, and c tags that are not nested inside another tag (sub-file or image)? So far I have tried this code:

    File xmlfile = new File(path);
    factory = DocumentBuilderFactory.newInstance();
    builder = factory.newDocumentBuilder();
    document = builder.parse(xmlfile);
    document.getDocumentElement().normalize();
    filelist = document.getElementsByTagName("file");
    for (int o = 0; o < filelist.getLength(); o++)
    {
        Node nNode = filelist.item(o);

        if (nNode.getNodeType() == Node.ELEMENT_NODE)
        {
            Element element = (Element) nNode;
            for (int a = 0; a < element.getElementsByTagName("file").getLength(); a++)
            {
                tagA = element.getElementsByTagName("a").item(a).getTextContent();

                tagB = element.getElementsByTagName("b").item(a).getTextContent();

                tagC = element.getElementsByTagName("c").item(a).getTextContent();
            }
        }
    } // loop

This code prints all the a, b, and c tags three times (the ones inside file, sub-file, and image).

Answer 1

Don't use getElementsByTagName(). Instead, navigate the DOM tree yourself:

Node fileNode = filelist.item(o);
for (Node child = fileNode.getFirstChild(); child != null; child = child.getNextSibling()) {
    if (child.getNodeType() == Node.ELEMENT_NODE) {
        switch (child.getNodeName()) {
            case "a":
                tagA = child.getTextContent();
                break;
            case "b":
                tagB = child.getTextContent();
                break;
            case "c":
                tagC = child.getTextContent();
                break;
            default:
                // ignore
        }
    }
}

As an alternative, you can also look into using XPath:

XPathFactory xpathFactory = XPathFactory.newInstance();
XPath xpath = xpathFactory.newXPath();

tagA = xpath.evaluate("a", fileNode);
tagB = xpath.evaluate("b", fileNode);
tagC = xpath.evaluate("c", fileNode);

Answer 2

Element.getElementsByTagName(String) returns all descendant elements with the provided tag name, not just the immediate children. You can navigate the tree either by calling getChildNodes() and iterating over the returned NodeList, or by calling getFirstChild() and iterating with getNextSibling().
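
To make the difference concrete, here is a minimal sketch that parses a trimmed-down version of the document from the question and compares the two approaches. The class name DirectChildDemo, the helpers parse and directChildText, and the sample text values are all made up for illustration; they are not part of the DOM API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DirectChildDemo {

    // Hypothetical sample that mirrors the structure shown in the question.
    static final String SAMPLE =
        "<file>"
        + "<sub-file><a>nested 1</a></sub-file>"
        + "<a>top level</a>"
        + "<image><a>nested 2</a></image>"
        + "</file>";

    // Parses an XML string; checked exceptions are wrapped to keep the sketch short.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Collects the text of elements named tagName that are immediate children of parent.
    static List<String> directChildText(Element parent, String tagName) {
        List<String> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && tagName.equals(child.getNodeName())) {
                result.add(child.getTextContent());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Element file = parse(SAMPLE).getDocumentElement();
        // Counts every descendant named "a", including those under sub-file and image.
        System.out.println(file.getElementsByTagName("a").getLength()); // 3
        // Only the "a" element directly under file.
        System.out.println(directChildText(file, "a")); // [top level]
    }
}
```

Returning a list rather than a single string is a design choice for the sketch: it also covers the case where a tag appears more than once directly under the parent.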

If you are not limited to using just DOM, you can also use XPath to select the appropriate nodes, for example //file/a.
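
As a rough sketch of the XPath route, the following evaluates an expression against an XML string and returns the text of every matching node. The class name XPathChildDemo, the helper select, and the sample values are hypothetical names chosen for illustration; only the javax.xml.xpath calls are standard JDK API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathChildDemo {

    // Hypothetical sample that mirrors the structure shown in the question.
    static final String SAMPLE =
        "<file>"
        + "<sub-file><a>nested 1</a></sub-file>"
        + "<a>top level</a>"
        + "<image><a>nested 2</a></image>"
        + "</file>";

    // Evaluates the expression as a node set and returns each node's text content.
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // "a" elements whose parent is a "file" element.
        System.out.println(select(SAMPLE, "//file/a")); // [top level]
        // Every "a" element at any depth, for comparison.
        System.out.println(select(SAMPLE, "//a").size()); // 3
    }
}
```

Note that //file/a matches a children of any file element anywhere in the document, while /file/a matches only when file is the root element. For the document in the question, both select the same nodes.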

2 answers

solution1
1 ACCEPTED 2018-07-02 21:19:21

solution2
0 2018-07-02 21:23:25
