
Java DOM Parser reading xml files information - nodes attributes

I have an XML file and am trying to read some information from it and arrange it. The data in the XML looks like this:

    <Class code="1-10" kind="category">
        <Meta name="P17b-d" value="2"/>
        <SuperClass code="1-10...1-10"/>
        <SubClass code="1-100"/>
        <Rubric kind="preferred">
            <Label xml:lang="de" xml:space="default">Klinische Untersuchung</Label>
        </Rubric>
    </Class>

and my Java class looks like:

import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;

public class Importer {

   public static void main(String[] args) {

      try {
         File inputFile = new File("ops2022.xml");
         DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
         DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
         Document doc = dBuilder.parse(inputFile);
         doc.getDocumentElement().normalize();
         NodeList nList = doc.getElementsByTagName("Class");
         
         for (int temp = 0; temp < 10; temp++) {
            Node nNode = nList.item(temp);
            System.out.println("\nCurrent Element :" + nNode.getNodeName() );
            Element iElement = (Element) nNode;
            if (nNode.getNodeType() == Node.ELEMENT_NODE && iElement.getAttribute("kind").equals("category")   ) {
               Element eElement = (Element) nNode;
               System.out.println("code : " 
                  + eElement.getAttribute("code"));
               System.out.println("Label : " 
                  + eElement
                  .getElementsByTagName("Label")
                  .item(0)
                  .getTextContent());
               System.out.println("SuperClass : " 
                  + eElement
                  .getElementsByTagName("SuperClass")
                  //I don't know how to get the attribute code here
                  );
            } 
         }
      } catch (Exception e) {
         e.printStackTrace();
      }
   }
}

But how do I get the attribute information of the "SuperClass" node? I don't know why, but Java handles eElement.getAttributeNode("SuperClass") as a Node, although it is an Element, so I can't use getAttribute().

I added the code from your answer (@Hiran Chaudhuri) to get the information I need:

import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
import org.w3c.dom.Attr;

public class Importer {

   public static void main(String[] args) {

      try {
         File inputFile = new File("ops2022.xml");
         DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
         DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
         Document doc = dBuilder.parse(inputFile);
         doc.getDocumentElement().normalize();
         NodeList nList = doc.getElementsByTagName("Class");
         
         for (int temp = 0; temp < 10; temp++) {
            Node nNode = nList.item(temp);
            System.out.println("\nCurrent Element :" + nNode.getNodeName() );
            Element iElement = (Element) nNode;
            if (nNode.getNodeType() == Node.ELEMENT_NODE && iElement.getAttribute("kind").equals("category")   ) {
               Element eElement = (Element) nNode;
               System.out.println("code : " 
                  + eElement.getAttribute("code"));
               System.out.println("Label : " 
                  + eElement
                  .getElementsByTagName("Label")
                  .item(0)
                  .getTextContent());
               // getElementsByTagName returns Element nodes, so this check is never true
               Node n = eElement.getElementsByTagName("SuperClass").item(0);
               if (n instanceof Attr) {
                   Attr a = (Attr) n;
                   System.out.println(a.getName());
                   System.out.println(a.getValue());
               }
            } 
         }
      } catch (Exception e) {
         e.printStackTrace();
      }
   }
}

And I get the following output:

----------------------------

Current Element :Class

Current Element :Class

Current Element :Class
code : 1-10
Label : Klinische Untersuchung

If I add an else clause like this:

else {
    Attr a = (Attr) n;
    System.out.println(a.getValue());
}

Java throws the following error:

java.lang.ClassCastException: class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl cannot be cast to class org.w3c.dom.Attr (com.sun.org.apache.xerces.internal.dom.DeferredElementImpl and org.w3c.dom.Attr are in module java.xml of loader 'bootstrap')
    at Importer.main(Importer.java:46)


Element.getAttributeNode() returns an Attr, which is a subinterface of Node. Attr has the getName() and getValue() methods you are looking for.

Element.getAttribute() returns the value of the named attribute directly as a String.
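To make the difference between the two calls concrete, here is a minimal sketch. It parses a one-element XML string instead of ops2022.xml; the class name AttributeAccessDemo and the helper parse are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class AttributeAccessDemo {

    // Hypothetical helper: parses an XML string into a DOM Document.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<SuperClass code=\"1-10...1-10\"/>");
        Element superClass = doc.getDocumentElement();

        // getAttribute returns the attribute value directly as a String.
        String value = superClass.getAttribute("code");
        System.out.println(value);

        // getAttributeNode returns the Attr node, which carries name and value.
        Attr attr = superClass.getAttributeNode("code");
        System.out.println(attr.getName() + " = " + attr.getValue());

        // For an attribute that does not exist, getAttribute returns an
        // empty string, while getAttributeNode returns null.
        System.out.println(superClass.getAttribute("missing").isEmpty());
        System.out.println(superClass.getAttributeNode("missing") == null);
    }
}
```

If you only need the value, getAttribute() is usually enough; getAttributeNode() is useful when you have to tell a missing attribute apart from an empty one.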

If you only hold a plain Node reference that actually is an attribute, you can still recover the correct type like this:

Node n = ... // this is the attribute you are interested in
if (n instanceof Attr) {
    Attr a = (Attr)n;
    System.out.println(a.getName());
    System.out.println(a.getValue());
}
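The ClassCastException in the question comes from the fact that getElementsByTagName() returns elements, not attributes, so the instanceof Attr check is false for those nodes. The following sketch, with a made-up class name NodeTypeDemo and helper method, shows one way to get from the element to an actual Attr through getAttributes().

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NodeTypeDemo {

    // Hypothetical helper: returns "name=value" for the first attribute of the
    // first SuperClass element, or null if there is no such element or attribute.
    public static String firstSuperClassAttribute(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Node n = doc.getElementsByTagName("SuperClass").item(0);
        if (n == null) {
            return null;
        }

        // n is the <SuperClass> element itself, never an attribute, so
        // (n instanceof Attr) is false and a cast (Attr) n would throw.
        if (n instanceof Attr) {
            throw new IllegalStateException("not expected for getElementsByTagName results");
        }

        // The attributes of an element are reachable through getAttributes();
        // the items of that map are the Attr nodes.
        NamedNodeMap attributes = n.getAttributes();
        if (attributes.getLength() == 0) {
            return null;
        }
        Node first = attributes.item(0);
        if (first instanceof Attr) {
            Attr a = (Attr) first;
            return a.getName() + "=" + a.getValue();
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Class code=\"1-10\"><SuperClass code=\"1-10...1-10\"/></Class>";
        System.out.println(firstSuperClassAttribute(xml));
    }
}
```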

So you want to know how to access the code attribute of SuperClass. For the sample above, this code prints exactly that one value:

Document doc = dBuilder.parse(inputFile);
NodeList nList = doc.getElementsByTagName("Class"); // this list only contains Element nodes
for (int temp = 0; temp < nList.getLength(); temp++) {
    Element nNode = (Element)nList.item(temp); // this is one 'class' element

    NodeList nList2 = nNode.getElementsByTagName("SuperClass"); // this list only contains Element nodes
    for (int temp2 = 0; temp2 < nList2.getLength(); temp2++) {
       Element superclass = (Element)nList2.item(temp2);
       String code = superclass.getAttribute("code");
       System.out.println(code);
    }
}

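However, XPath can fetch the value of the first match with a single expression:

Document doc = dBuilder.parse(inputFile);
XPath xpath = XPathFactory.newInstance().newXPath();
// "//Class" matches Class elements at any depth; "/Class" only matches if Class is the document element
String code = xpath.evaluate("//Class/SuperClass/@code", doc);

XPath expressions let you navigate the DOM tree much more concisely than nested loops. Note that this two-argument evaluate() returns only the string value of the first matching node, and that XPath and XPathFactory must be imported from javax.xml.xpath.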
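To collect the code attribute of every matching SuperClass rather than a single string, the three-argument evaluate() with XPathConstants.NODESET can be used. This is only a sketch: the class name XPathCodesDemo, the helper method, and the Root wrapper element in the sample XML are assumptions for illustration, not taken from ops2022.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathCodesDemo {

    // Hypothetical helper: returns the code attribute of every SuperClass
    // that is a direct child of a Class element with kind="category".
    public static List<String> superClassCodes(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // NODESET asks for all matching attribute nodes instead of one string.
        NodeList attrs = (NodeList) xpath.evaluate(
                "//Class[@kind='category']/SuperClass/@code",
                doc,
                XPathConstants.NODESET);

        List<String> codes = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            codes.add(attrs.item(i).getNodeValue());
        }
        return codes;
    }

    public static void main(String[] args) throws Exception {
        // "Root" is a made-up wrapper element for this example.
        String xml =
                "<Root>"
              + "<Class code=\"1-10\" kind=\"category\"><SuperClass code=\"1-10...1-10\"/></Class>"
              + "<Class code=\"1-20\" kind=\"category\"><SuperClass code=\"1-20...1-33\"/></Class>"
              + "<Class code=\"1\" kind=\"chapter\"><SuperClass code=\"top\"/></Class>"
              + "</Root>";
        for (String code : superClassCodes(xml)) {
            System.out.println(code);
        }
    }
}
```

The kind='category' predicate takes over the filtering that the question does with getAttribute("kind").equals("category") inside the loop.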

The following code did the job for me:

for (int i = 0; i < nList.getLength(); i++) {
    Node nNode = nList.item(i);
    if (nNode.getNodeType() == Node.ELEMENT_NODE) {
        Element eElement = (Element) nNode;
        String supString = "OPS-2022";
        NodeList fieldNodes = eElement.getElementsByTagName("SuperClass");
        for (int j = 0; j < fieldNodes.getLength(); j++) {
            Node fieldNode = fieldNodes.item(j);
            NamedNodeMap attributes = fieldNode.getAttributes();
            Node attr = attributes.getNamedItem("code");
            if (attr != null) {
                supString = attr.getTextContent();
            }
        }
    }
}

Thanks for your help!
