
Parsing the multilevel XML File with Java - Dom Parser

I have this XML file with 3 categories: employee_list, position_details and employee_info.

<?xml version="1.0" encoding="UTF-8"?>
<employee>
    <employee_list>
        <employee ID="1">
            <firstname>Andrei</firstname>
            <lastname>Rus</lastname>
            <age>23</age>
            <position-skill ref="Java"/>
            <detail-ref ref="AndreiR"/>
        </employee>

        <employee ID="2">
            <firstname>Ion</firstname>
            <lastname>Popescu</lastname>
            <age>25</age>
            <position-skill ref="Python"/>
            <detail-ref ref="IonP"/>
        </employee>

        <employee ID="3">
            <firstname>Georgiana</firstname>
            <lastname>Domide</lastname>
            <age>33</age>
            <position-skill ref="C"/>
            <detail-ref ref="GeorgianaD"/>
        </employee>

    </employee_list>

    <position_details>
        <position ID="Java">
            <role>Junior Developer</role>
            <skill_name>Java</skill_name>
            <experience>1</experience><!-- years of experience -->
        </position>

        <position ID="Python">
            <role>Developer</role>
            <skill_name>Python</skill_name>
            <experience>3</experience> 
        </position>

        <position ID="C">
            <role>Senior Developer</role>
            <skill_name>C</skill_name>
            <experience>5</experience>
        </position>
    </position_details>

    <employee_info>
        <detail ID="AndreiR">
            <username>AndreiR</username>
            <residence>Timisoara</residence>
            <yearOfBirth>1999</yearOfBirth>
            <phone>0</phone>
        </detail>

        <detail ID="IonP">
            <username>IonP</username>
            <residence>Timisoara</residence>
            <yearOfBirth>1997</yearOfBirth>
            <phone>0</phone>
        </detail>

        <detail ID="GeorgianaD">
            <username>GeorgianaD</username>
            <residence>Arad</residence>
            <yearOfBirth>1989</yearOfBirth>
            <phone>0</phone>
        </detail>
    </employee_info>
</employee>

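I want to write Java code for all 3 categories, but so far I have only managed to get through the first category (employee_list). When I try to retrieve information from the position_details or employee_info categories, the program fails to look up the information by category.

I wrote Java code for the 3 categories, and the result looks like this: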

package Dom;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import java.io.File;
import java.io.IOException;
import java.util.Scanner;

public class main {

    public static void main(String[] args) {
        try {
            File xmlDoc = new File("employees.xml");
            DocumentBuilderFactory dbFact = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuild = dbFact.newDocumentBuilder(); 
            Document doc = dBuild.parse(xmlDoc);
            
            // Read the root element:
            // doc.getDocumentElement() locates the root, getNodeName() returns its name
            System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
            System.out.println("-----------------------------------------------------------------------------");
            
            // Read all <employee> elements into a NodeList
            NodeList nList = doc.getElementsByTagName("employee");
            System.out.println("Total Category inside = " + nList.getLength());
            System.out.println("-----------------------------------------------------");
            
            
            for(int i = 0 ; i<nList.getLength();i++) {
                Node nNode = nList.item(i);
                //System.out.println("Node name: " + nNode.getNodeName()+" " + (i+1));
                if(nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;
                    System.out.println("Person id#: " + eElement.getAttribute("id"));
                    System.out.println("Person Last Name: " + eElement.getElementsByTagName("lastname").item(0).getTextContent());
                    System.out.println("Person First name: " + eElement.getElementsByTagName("firstname").item(0).getTextContent());
                    System.out.println("Person Age: " + eElement.getElementsByTagName("age").item(0).getTextContent());
                    System.out.println("--------------------------------------------------------------------------");
                }
            }
            
            System.out.println("=============================================================================================");
            
            nList = doc.getElementsByTagName("position");
            System.out.println("Total Category inside = " + nList.getLength());
            System.out.println("-----------------------------------------------------");
            for(int i = 0 ; i<nList.getLength();i++) {
                Node nNode = nList.item(i);
                //System.out.println("Node name: " + nNode.getNodeName()+" " + (i+1));
                if(nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;
                    System.out.println("Role: " + eElement.getElementsByTagName("role").item(0).getTextContent());
                    System.out.println("Skill: "+ eElement.getElementsByTagName("skill_name").item(0).getTextContent());
                    System.out.println("Experience: "+ eElement.getElementsByTagName("experience").item(0).getTextContent());
                    System.out.println("--------------------------------------------------------------------------");
                }
            }
            
            System.out.println("=============================================================================================");
            
            nList = doc.getElementsByTagName("detail");
            System.out.println("Total Category inside = " + nList.getLength());
            System.out.println("-----------------------------------------------------");
            for(int i = 0 ; i<nList.getLength();i++) {
                Node nNode = nList.item(i);
                //System.out.println("Node name: " + nNode.getNodeName()+" " + (i+1));
                if(nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;
                    System.out.println("Person with username: " +  eElement.getElementsByTagName("username").item(0).getTextContent());
                    System.out.println("Username: " + eElement.getElementsByTagName("username").item(0).getTextContent());
                    System.out.println("Residence: "+ eElement.getElementsByTagName("residence").item(0).getTextContent());
                    System.out.println("Year of birth: "+ eElement.getElementsByTagName("yearOfBirth").item(0).getTextContent());
                    System.out.println("Phone: "+ eElement.getElementsByTagName("phone").item(0).getTextContent());
                    System.out.println("--------------------------------------------------------------------------");
                }
            }
            
            
        }catch(Exception e) {
            // Report the failure instead of silently swallowing it
            e.printStackTrace();
        }
        
    }

}

Output:

Root element: employee
-----------------------------------------------------------------------------
Total Category inside = 4
-----------------------------------------------------
Person id#: 
Person Last Name: Rus
Person First name: Andrei
Person Age: 23
--------------------------------------------------------------------------
Person id#: 
Person Last Name: Rus
Person First name: Andrei
Person Age: 23
--------------------------------------------------------------------------
Person id#: 
Person Last Name: Popescu
Person First name: Ion
Person Age: 25
--------------------------------------------------------------------------
Person id#: 
Person Last Name: Domide
Person First name: Georgiana
Person Age: 33
--------------------------------------------------------------------------
=============================================================================================
Total Category inside = 3
-----------------------------------------------------
Role: Junior Developer
Skill: Java
Experience: 1
--------------------------------------------------------------------------
Role: Developer
Skill: Python
Experience: 3
--------------------------------------------------------------------------
Role: Senior Developer
Skill: C
Experience: 5
--------------------------------------------------------------------------
=============================================================================================
Total Category inside = 3
-----------------------------------------------------
Person with username: AndreiR
Username: AndreiR
Residence: Timisoara
Year of birth: 1999
Phone: 0
--------------------------------------------------------------------------
Person with username: IonP
Username: IonP
Residence: Timisoara
Year of birth: 1997
Phone: 0
--------------------------------------------------------------------------
Person with username: GeorgianaD
Username: GeorgianaD
Residence: Arad
Year of birth: 1989
Phone: 0
--------------------------------------------------------------------------

Is it possible to group the output a bit, so that each person is shown in the following form:

PersonId
firstname
lastname
age
role
skill_name
experience
username
residence
yearOfBirth
phone

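You are basically done already: you wrote the code for all three categories. "The program fails to look up the information by category" probably means that the output is not the output you wanted. The likely reason is that Document.getElementsByTagName() searches the whole document for every element whose name matches the argument. Because your root element is also named employee, it is included as an additional Node in the NodeList returned by doc.getElementsByTagName("employee"). Hence "Total Category inside = 4". That root element has no ID attribute of its own. Note also that attribute names are case-sensitive: the XML uses ID while the code calls getAttribute("id"), so "Person id#:" comes back empty for every entry.

If you then call getElementsByTagName("lastname") on the first Node/Element, which is the root, it does find a <lastname/> element below it, just not as a direct child: it sits two levels down, inside the first "actual"/desired <employee/>. That is why Andrei Rus is printed twice.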
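The behaviour described above can be reproduced in a few lines. This is only a minimal sketch: the class name GlobalSearchDemo, the parse helper and the shortened XML string are made up for illustration and stand in for employees.xml.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

final class GlobalSearchDemo {

    // Hypothetical, shortened stand-in for employees.xml: same nesting, one employee.
    static final String XML =
            "<employee><employee_list>"
          + "<employee ID=\"1\"><firstname>Andrei</firstname><lastname>Rus</lastname></employee>"
          + "</employee_list></employee>";

    // Parses an XML string; checked parser exceptions are wrapped to keep the demo short.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);

        // Document-wide search: the root <employee> is matched as well.
        NodeList all = doc.getElementsByTagName("employee");
        System.out.println("global matches: " + all.getLength()); // 2 (root + real employee)

        Element root = (Element) all.item(0); // results are in document order, root first
        Element real = (Element) all.item(1);

        System.out.println("root ID: '" + root.getAttribute("ID") + "'"); // '' (no such attribute)
        System.out.println("real id: '" + real.getAttribute("id") + "'"); // '' (wrong case)
        System.out.println("real ID: '" + real.getAttribute("ID") + "'"); // '1'

        // The root only "has" a lastname as a descendant two levels down.
        System.out.println(root.getElementsByTagName("lastname").item(0).getTextContent()); // Rus
    }
}
```

getAttribute returns an empty string rather than null when the attribute is missing, which is why the original program prints a blank id instead of failing.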

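So what you probably want is to search for the element names not globally but in the local context of each category, as you already do successfully elsewhere inside the loop. Perhaps just change

NodeList nList = doc.getElementsByTagName("employee");

to

NodeList nList = doc.getElementsByTagName("employee_list");
nList = ((Element)nList.item(0)).getElementsByTagName("employee");

for the employee_list category, and do the same for the other categories.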
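Applied to all three categories, the scoped lookup could be wrapped in one small helper. The following is a sketch under assumptions: the names ScopedLookup, parse and within are hypothetical, and the XML string is a cut-down version of the file from the question.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

final class ScopedLookup {

    // Hypothetical, shortened stand-in for employees.xml.
    static final String XML = "<employee>"
            + "<employee_list>"
            + "<employee ID=\"1\"><lastname>Rus</lastname></employee>"
            + "<employee ID=\"2\"><lastname>Popescu</lastname></employee>"
            + "</employee_list>"
            + "<position_details>"
            + "<position ID=\"Java\"><role>Junior Developer</role></position>"
            + "</position_details>"
            + "<employee_info>"
            + "<detail ID=\"AndreiR\"><residence>Timisoara</residence></detail>"
            + "</employee_info>"
            + "</employee>";

    // Parses an XML string; checked parser exceptions are wrapped to keep the demo short.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns the elements named childTag below the first containerTag element.
    // Element.getElementsByTagName only looks at descendants, so the root is never matched.
    static NodeList within(Document doc, String containerTag, String childTag) {
        NodeList containers = doc.getElementsByTagName(containerTag);
        if (containers.getLength() == 0) {
            throw new IllegalArgumentException("No <" + containerTag + "> element found");
        }
        return ((Element) containers.item(0)).getElementsByTagName(childTag);
    }

    public static void main(String[] args) {
        Document doc = parse(XML);

        NodeList employees = within(doc, "employee_list", "employee");
        System.out.println("employees: " + employees.getLength());
        for (int i = 0; i < employees.getLength(); i++) {
            Element e = (Element) employees.item(i);
            System.out.println("Person id#: " + e.getAttribute("ID")); // upper-case ID, as in the XML
        }

        System.out.println("positions: " + within(doc, "position_details", "position").getLength());
        System.out.println("details:   " + within(doc, "employee_info", "detail").getLength());
    }
}
```

Only the employee lookup strictly needs the scoping, since position and detail do not clash with any other element name, but using the same helper everywhere keeps the three loops uniform.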

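To group the records better, you do not need to output/print them immediately. You can copy/store the values obtained from the DOM in the members of objects of a class you define, or create a more generic class that stores the field values in a contained Map, List or something similar. That way you can iterate/loop over the objects or lists you created and output/print the field values in whatever order you like.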
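One possible way to do this is sketched below, using the generic Map/List variant. It assumes that position-skill ref and detail-ref ref point at the ID attributes of position and detail, as they do in the posted XML, and that every reference resolves. The names GroupedEmployees, within, indexById, copy and group are hypothetical, and the XML string is a two-employee excerpt of the original file.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupedEmployees {

    // Two-employee excerpt of employees.xml from the question.
    static final String SAMPLE = "<employee><employee_list>"
        + "<employee ID=\"1\"><firstname>Andrei</firstname><lastname>Rus</lastname><age>23</age>"
        + "<position-skill ref=\"Java\"/><detail-ref ref=\"AndreiR\"/></employee>"
        + "<employee ID=\"2\"><firstname>Ion</firstname><lastname>Popescu</lastname><age>25</age>"
        + "<position-skill ref=\"Python\"/><detail-ref ref=\"IonP\"/></employee>"
        + "</employee_list><position_details>"
        + "<position ID=\"Java\"><role>Junior Developer</role><skill_name>Java</skill_name>"
        + "<experience>1</experience></position>"
        + "<position ID=\"Python\"><role>Developer</role><skill_name>Python</skill_name>"
        + "<experience>3</experience></position>"
        + "</position_details><employee_info>"
        + "<detail ID=\"AndreiR\"><username>AndreiR</username><residence>Timisoara</residence>"
        + "<yearOfBirth>1999</yearOfBirth><phone>0</phone></detail>"
        + "<detail ID=\"IonP\"><username>IonP</username><residence>Timisoara</residence>"
        + "<yearOfBirth>1997</yearOfBirth><phone>0</phone></detail>"
        + "</employee_info></employee>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Elements named childTag below the first containerTag element (scoped search).
    static NodeList within(Document doc, String containerTag, String childTag) {
        return ((Element) doc.getElementsByTagName(containerTag).item(0)).getElementsByTagName(childTag);
    }

    // Builds a lookup table from the ID attribute to the element carrying it.
    static Map<String, Element> indexById(NodeList nodes) {
        Map<String, Element> byId = new HashMap<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            byId.put(e.getAttribute("ID"), e);
        }
        return byId;
    }

    // Copies the text of each named child element of 'from' into the record.
    static void copy(Element from, Map<String, String> record, String... tags) {
        for (String tag : tags) {
            record.put(tag, from.getElementsByTagName(tag).item(0).getTextContent().trim());
        }
    }

    // Reads the ref attribute of the first child element named tag.
    static String ref(Element employee, String tag) {
        return ((Element) employee.getElementsByTagName(tag).item(0)).getAttribute("ref");
    }

    // One insertion-ordered record per employee; assumes every ref resolves.
    static List<Map<String, String>> group(Document doc) {
        Map<String, Element> positions = indexById(within(doc, "position_details", "position"));
        Map<String, Element> details = indexById(within(doc, "employee_info", "detail"));
        NodeList employees = within(doc, "employee_list", "employee");

        List<Map<String, String>> records = new ArrayList<>();
        for (int i = 0; i < employees.getLength(); i++) {
            Element emp = (Element) employees.item(i);
            Map<String, String> record = new LinkedHashMap<>();
            record.put("PersonId", emp.getAttribute("ID"));
            copy(emp, record, "firstname", "lastname", "age");
            copy(positions.get(ref(emp, "position-skill")), record, "role", "skill_name", "experience");
            copy(details.get(ref(emp, "detail-ref")), record, "username", "residence", "yearOfBirth", "phone");
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        for (Map<String, String> record : group(parse(SAMPLE))) {
            record.forEach((field, value) -> System.out.println(field + ": " + value));
            System.out.println("----------");
        }
    }
}
```

A LinkedHashMap keeps the fields in insertion order, so each record prints in the order requested in the question. A dedicated Employee class with typed fields would work just as well if the set of fields is fixed.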
