Reading Multi-Level XML files in Java

Until recently, the tag structure of my XML files was fairly simple. But now I have an extra level of tags within tags, and parsing the XML has become more complicated.

Here is an example of one of my new XML files (I've changed the tag names to make it easier to understand):

<SchoolRoster>
   <Student>
      <name>John</name>
      <age>14</age>
      <course>
         <math>A</math>
         <english>B</english>
      </course>
      <course>
         <government>A+</government>
      </course>
   </Student>
   <Student>
      <name>Tom</name>
      <age>13</age>
      <course>
         <gym>A</gym>
         <geography>incomplete</geography>
      </course>
   </Student>
</SchoolRoster>

The important features of the XML above are that a student can have multiple "course" elements, and each of those can contain arbitrarily named tags as children. There can be any number of these children, and I want to read them into a HashMap of "name", "value" pairs.

public static TreeMap getAllSchoolRosterInformation(String fileName) {
    TreeMap SchoolRoster = new TreeMap();

    try {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        File file = new File(fileName);
        if (file.exists()) {
            Document doc = db.parse(file);
            Element docEle = doc.getDocumentElement();
            NodeList studentList = docEle.getElementsByTagName("Student");

            if (studentList != null && studentList.getLength() > 0) {
                for (int i = 0; i < studentList.getLength(); i++) {

                    Student aStudent = new Student();
                    Node node = studentList.item(i);

                    if (node.getNodeType() == Node.ELEMENT_NODE) {
                        Element e = (Element) node;
                        NodeList nodeList = e.getElementsByTagName("name");
                        aStudent.setName(nodeList.item(0).getChildNodes().item(0).getNodeValue());

                        nodeList = e.getElementsByTagName("age");
                        aStudent.setAge(Integer.parseInt(nodeList.item(0).getChildNodes().item(0).getNodeValue()));

                        nodeList = e.getElementsByTagName("course");
                        if (nodeList != null && nodeList.getLength() > 0) {
                            Course[] courses = new Course[nodeList.getLength()];
                            for (int j = 0; j < nodeList.getLength(); j++) {

                                Course singleCourse = new Course();
                                HashMap classGrades = new HashMap();
                                NodeList CourseNodeList = nodeList.item(j).getChildNodes();

                                for (int k = 0; k < CourseNodeList.getLength(); k++) {
                                    if (CourseNodeList.item(k).getNodeType() == Node.ELEMENT_NODE && CourseNodeList != null) {
                                        classGrades.put(CourseNodeList.item(k).getNodeName(), CourseNodeList.item(k).getNodeValue());
                                    }
                                }
                                singleCourse.setRewards(classGrades);
                                courses[j] = singleCourse;
                            }
                            aStudent.setCourses(courses);
                        }
                    }
                    SchoolRoster.put(aStudent.getName(), aStudent);
                }
            }
        } else {
            System.exit(1);
        }
    } catch (Exception e) {
        System.out.println(e);
    }
    return SchoolRoster;
}

The problem I'm running into is that instead of the student getting an "A" in "math", they get a null in "math". (If this post is too long, I can try to find some way of shortening it.)

If this were my project, I would avoid manually dissecting the data in the XML and instead let Java do it for me by using JAXB. The more I use this tool, the more I love it. I urge you to consider trying it: all you need to turn the XML into Java objects is the proper annotations on your Java classes and a call to unmarshal the XML. The code would be much simpler and thus much less likely to have errors.

For instance, the following code marshalled information into an XML file quite easily and cleanly:

import java.io.File;
import java.util.ArrayList;
import java.util.List;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;

@XmlRootElement
public class SchoolRoster {
   @XmlElement(name = "student")
   private List<Student> students = new ArrayList<Student>();

   public SchoolRoster() {
   }

   public List<Student> getStudents() {
      return students;
   }

   public void addStudent(Student student) {
      students.add(student);
   }

   public static void main(String[] args) {
      Student john = new Student("John", 14);
      john.addCourse(new Course("math", "A"));
      john.addCourse(new Course("english", "B"));

      Student tom = new Student("Tom", 13);
      tom.addCourse(new Course("gym", "A"));
      tom.addCourse(new Course("geography", "incomplete"));

      SchoolRoster roster = new SchoolRoster();
      roster.addStudent(tom);
      roster.addStudent(john);

      try {
         JAXBContext context = JAXBContext.newInstance(SchoolRoster.class);
         Marshaller marshaller = context.createMarshaller();
         marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);

         String pathname = "MySchoolRoster.xml";
         File rosterFile = new File(pathname);
         marshaller.marshal(roster, rosterFile);
         marshaller.marshal(roster, System.out);

      } catch (JAXBException e) {
         e.printStackTrace();
      }
   }
}

@XmlRootElement
@XmlType(propOrder = { "name", "age", "courses" })
class Student {
  // TODO: completion left as an exercise for the original poster
}

@XmlRootElement
@XmlType(propOrder = { "name", "grade" })
class Course {
  // TODO: completion left as an exercise for the original poster
}

This resulted in the following XML:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<schoolRoster>
    <student>
        <name>Tom</name>
        <age>13</age>
        <courses>
            <course>
                <name>gym</name>
                <grade>A</grade>
            </course>
            <course>
                <name>geography</name>
                <grade>incomplete</grade>
            </course>
        </courses>
    </student>
    <student>
        <name>John</name>
        <age>14</age>
        <courses>
            <course>
                <name>math</name>
                <grade>A</grade>
            </course>
            <course>
                <name>english</name>
                <grade>B</grade>
            </course>
        </courses>
    </student>
</schoolRoster>

Unmarshalling this into a SchoolRoster object filled with data takes only a few lines of code:

private static void unmarshallTest() {
  try {
     JAXBContext context = JAXBContext.newInstance(SchoolRoster.class);
     Unmarshaller unmarshaller = context.createUnmarshaller();

     String pathname = "MySchoolRoster.xml"; // whatever the file name should be
     File rosterFile = new File(pathname);
     SchoolRoster roster = (SchoolRoster) unmarshaller.unmarshal(rosterFile);
     System.out.println(roster);
  } catch (JAXBException e) {
     e.printStackTrace();
  }
}

After adding toString() methods to my classes, this resulted in:

SchoolRoster 
  [students=
    [Student [name=Tom, age=13, courses=[Course [name=gym, grade=A], Course [name=geography, grade=incomplete]]], 
    Student [name=John, age=14, courses=[Course [name=math, grade=A], Course [name=english, grade=B]]]]]

for (int k = 0; k < CourseNodeList.getLength(); k++) {
    if (CourseNodeList.item(k).getNodeType() == Node.ELEMENT_NODE && CourseNodeList != null) {
        classGrades.put(CourseNodeList.item(k).getNodeName(),
        CourseNodeList.item(k).getNodeValue());
    }
}

You're calling getNodeValue() on an Element. According to the JDK API docs, that returns null.

http://docs.oracle.com/javase/6/docs/api/org/w3c/dom/Node.html

You need to get the child Text node and call getNodeValue() on that. Here's a really quick and dirty way to do so:

classGrades.put(CourseNodeList.item(k).getNodeName(),
        CourseNodeList.item(k).getChildNodes().item(0).getNodeValue());

Please don't use this in production code. It's ugly. But it'll point you in the right direction.
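
For something a little less fragile, here is a minimal sketch that reads the element's text with getTextContent(), a standard method on org.w3c.dom.Node, instead of reaching for the first child by index. The class name CourseGradesDemo and the helpers parseRoot and readCourse are made up for illustration and are not part of the code in the question; the XML is parsed from a string only to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CourseGradesDemo {

    // Hypothetical helper: parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return db.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Hypothetical helper: maps each child element of one <course> to its text.
    static Map<String, String> readCourse(Node course) {
        Map<String, String> grades = new LinkedHashMap<>();
        NodeList children = course.getChildNodes();
        for (int k = 0; k < children.getLength(); k++) {
            Node child = children.item(k);
            // Skip the whitespace Text nodes that sit between the elements.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                // child.getNodeValue() is null for an Element;
                // getTextContent() returns the text inside it.
                grades.put(child.getNodeName(), child.getTextContent().trim());
            }
        }
        return grades;
    }

    public static void main(String[] args) {
        String xml = "<Student>\n"
                + "  <name>John</name>\n"
                + "  <course>\n"
                + "    <math>A</math>\n"
                + "    <english>B</english>\n"
                + "  </course>\n"
                + "  <course>\n"
                + "    <government>A+</government>\n"
                + "  </course>\n"
                + "</Student>";
        NodeList courses = parseRoot(xml).getElementsByTagName("course");
        for (int j = 0; j < courses.getLength(); j++) {
            System.out.println(readCourse(courses.item(j)));
        }
        // prints:
        // {math=A, english=B}
        // {government=A+}
    }
}
```

In the question's loop this amounts to swapping getNodeValue() for getTextContent() on the element node. It also copes with an empty tag such as <math/>, where getChildNodes().item(0) would be null.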

Like @Hovercraft, I recommend using a library to handle the serialization to XML. I found XStream to have great performance and to be easy to use: http://x-stream.github.io/

For example:

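// Fragment: assumes a static XStream field named xstream and a proctorDAO
// object holding the student list, both declared elsewhere in the class.
public static void saveStudentsXML(FileOutputStream file) throws Exception {
    if (xstream == null)
        initXstream();

    xstream.toXML(proctorDAO.studentList, file);
    file.close();
}

public static void initXstream() {
    xstream = new XStream();
    // Write each Student as a <student> element instead of the full class name.
    xstream.alias("student", Student.class);

    // Serialize these fields as XML attributes rather than child elements.
    xstream.useAttributeFor(Student.class, "lastName");
    xstream.useAttributeFor(Student.class, "firstName");
    xstream.useAttributeFor(Student.class, "id");
    xstream.useAttributeFor(Student.class, "gradYear");
    // Shorten the attribute names used in the XML.
    xstream.aliasAttribute(Student.class, "lastName", "last");
    xstream.aliasAttribute(Student.class, "gradYear", "gc");
    xstream.aliasAttribute(Student.class, "firstName", "first");
}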

Sample XML to demonstrate a nested property:

<list>
  <student first="Ralf" last="Adams" gc="2014" id="100">
     <testingMods value="1" boolMod="2"/>
  </student>
  <student first="Mick" last="Agosti" gc="2014" id="102">
     <testingMods value="1" boolMod="2"/>
  </student>
  <student first="Edmund" last="Baggio" gc="2013" id="302">
     <testingMods value="1" boolMod="6"/>
  </student> 
</list>
