How to extract specific tag content from an XML file in Java when other tags with the same name exist inside another tag?

Question

I'm currently parsing XML files in Java using DOM, but I've run into a problem: how do I extract the content of specific tags when tags with the same name also appear nested inside other tags, as in the following scenario:

<file>
    <sub-file>
        <a> ....</a>
        <b> ....</b>
        <c> ....</c>
    </sub-file>

    <a> ..... some data here ....</a>
    <b> ..... some data here ....</b>
    <c> ..... some data here ....</c>

    <image>
        <a> ....</a>
        <b> ....</b>
        <c> ....</c>
    </image>
</file>

So how can I extract only the a, b, c tags that are direct children of file, and not the ones inside sub-file or image? So far I have tried this code:

    File xmlfile = new File(path);
    factory = DocumentBuilderFactory.newInstance();
    builder = factory.newDocumentBuilder();
    document = builder.parse(xmlfile);
    document.getDocumentElement().normalize();
    filelist = document.getElementsByTagName("file");
    for (int o = 0; o < filelist.getLength(); o++) {
        Node nNode = filelist.item(o);

        if (nNode.getNodeType() == Node.ELEMENT_NODE) {
            Element element = (Element) nNode;
            for (int a = 0; a < element.getElementsByTagName("file").getLength(); a++) {
                tagA = element.getElementsByTagName("a").item(a).getTextContent();
                tagB = element.getElementsByTagName("b").item(a).getTextContent();
                tagC = element.getElementsByTagName("c").item(a).getTextContent();
            }
        }
    } // loop

This code prints the a, b, c tags 3 times (the ones inside file, sub-file and image).

Answer 1

Don't use getElementsByTagName(). Instead, navigate the DOM tree yourself:

Node fileNode = filelist.item(o);
for (Node child = fileNode.getFirstChild(); child != null; child = child.getNextSibling()) {
    if (child.getNodeType() == Node.ELEMENT_NODE) {
        switch (child.getNodeName()) {
            case "a":
                tagA = child.getTextContent();
                break;
            case "b":
                tagB = child.getTextContent();
                break;
            case "c":
                tagC = child.getTextContent();
                break;
            default:
                // ignore
        }
    }
}

As an alternative, you can also look into using XPath:

XPathFactory xpathFactory = XPathFactory.newInstance();
XPath xpath = xpathFactory.newXPath();

tagA = xpath.evaluate("a", fileNode);
tagB = xpath.evaluate("b", fileNode);
tagC = xpath.evaluate("c", fileNode);
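
For anyone who wants to try the XPath variant in isolation, here is a minimal, self-contained sketch. The class name RelativeXPathSketch, the helper directChildText and the sample text values are made up for illustration; only the xpath.evaluate(expression, contextNode) call comes from the snippet above. It assumes that file is the document's root element, as in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class RelativeXPathSketch {
    // Sample document shaped like the one in the question; the text values are invented.
    static final String SAMPLE =
        "<file>\n"
        + "  <sub-file><a>sub a</a><b>sub b</b><c>sub c</c></sub-file>\n"
        + "  <a>top a</a><b>top b</b><c>top c</c>\n"
        + "  <image><a>img a</a><b>img b</b><c>img c</c></image>\n"
        + "</file>";

    // Hypothetical helper: text of the first direct child of the root element
    // with the given name, or "" if there is no such child.
    static String directChildText(String xml, String name) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Element fileNode = document.getDocumentElement();
            XPath xpath = XPathFactory.newInstance().newXPath();
            // A relative expression such as "a" means child::a, so only the
            // children of fileNode are considered, not deeper descendants.
            return xpath.evaluate(name, fileNode);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(directChildText(SAMPLE, "a"));
        System.out.println(directChildText(SAMPLE, "b"));
        System.out.println(directChildText(SAMPLE, "c"));
    }
}
```

Note that the two-argument evaluate returns a string, namely the text of the first matching node in document order, so it fits the case where each of a, b, c occurs once directly under file.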

Answer 2

Element.getElementsByTagName(String) returns all descendant elements with the provided tag name, not just the immediate children. You can navigate the tree by calling getChildNodes() and iterating over the returned NodeList, or by calling getFirstChild() and then iterating with getNextSibling().
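
The getChildNodes() route could look roughly like the sketch below. The class name DirectChildrenSketch, the helper directChildText and the sample text values are hypothetical, and the sketch assumes file is the root element of the document.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DirectChildrenSketch {
    // Sample document shaped like the one in the question; the text values are invented.
    static final String SAMPLE =
        "<file>\n"
        + "  <sub-file><a>sub a</a><b>sub b</b><c>sub c</c></sub-file>\n"
        + "  <a>top a</a><b>top b</b><c>top c</c>\n"
        + "  <image><a>img a</a><b>img b</b><c>img c</c></image>\n"
        + "</file>";

    // Hypothetical helper: maps the names a, b, c to the text of the matching
    // direct children of the root element, ignoring anything nested deeper.
    static Map<String, String> directChildText(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // getChildNodes() lists immediate children only (elements, text, comments).
            NodeList children = document.getDocumentElement().getChildNodes();
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace text nodes between elements
                }
                String name = child.getNodeName();
                if (name.equals("a") || name.equals("b") || name.equals("c")) {
                    result.put(name, child.getTextContent());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(directChildText(SAMPLE));
    }
}
```

Because sub-file and image are themselves children of file, the loop sees them as single elements and never descends into them, which is why their a, b, c tags are left out.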

If you are not limited to using just DOM, you can also use XPath to select the appropriate nodes, e.g. //file/a.
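
A rough sketch of selecting nodes by path is shown below; the class name PathSelectSketch, the helper select and the sample text values are invented for illustration. One thing to keep in mind: //file/a matches the a children of every file element anywhere in the document, whereas /file/a is anchored at the root. For the document in the question both select the same single element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PathSelectSketch {
    // Sample document shaped like the one in the question; the text values are invented.
    static final String SAMPLE =
        "<file>\n"
        + "  <sub-file><a>sub a</a><b>sub b</b><c>sub c</c></sub-file>\n"
        + "  <a>top a</a><b>top b</b><c>top c</c>\n"
        + "  <image><a>img a</a><b>img b</b><c>img c</c></image>\n"
        + "</file>";

    // Hypothetical helper: evaluates the expression against the whole document
    // and returns the text content of every selected node, in document order.
    static List<String> select(String xml, String expression) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // XPathConstants.NODESET asks for all matches instead of a single string.
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, document, XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select(SAMPLE, "//file/a")); // a directly under a file element
        System.out.println(select(SAMPLE, "//a"));      // every a, nested ones included
    }
}
```

The //a expression behaves like getElementsByTagName("a") and picks up the nested tags as well, so it reproduces the problem from the question rather than solving it.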


Answer 1
1 accepted 2018-07-02 21:19:21

Answer 2
0 2018-07-02 21:23:25
