
Java DOM parser returns null document

I have an HTML template which I want to read in:

<html>
   <head>
      <title>TEST</title>
   </head>
   <body>
      <h1 id="hey">Hello, World!</h1>
   </body>
</html>

I want to find the tag with the id hey and then insert new content (e.g. new tags) into it. For this purpose I use the DOM parser, but my code prints null:

public static void main(String[] args) {

    try {
        File file = new File("C:\\Users\\<username>\\Desktop\\template.html");
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        Document doc = dBuilder.parse(file);
        doc.getDocumentElement().normalize();

        System.out.println(doc.getElementById("hey")); // returns null

    } catch (Exception e) {
        e.printStackTrace();
    }

}

What am I doing wrong?

You are parsing your HTML as XML with the Java XML API, which follows the XML specification strictly and makes no HTML-specific allowances.

In XML, an attribute named id is not automatically of type ID, so .getElementById() does not find it. You can either use another library (jsoup, for example), instruct the parser to treat id as an ID (via a DTD), or write custom lookup code.
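To illustrate the "custom code" option with only the JDK, here is a minimal sketch. The class name IdLookupSketch and its helper methods are made up for this example, and the markup is parsed from an inline string rather than from your file. It looks the element up with an XPath expression on the id attribute, then shows that marking the attribute as an ID with setIdAttribute makes getElementById work as well.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class IdLookupSketch {

    // Parses well-formed markup from a string with the standard DOM parser.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Custom lookup: first element whose "id" attribute equals the given value.
    // Assumes the id contains no single quote, since it is spliced into the XPath.
    public static Element findById(Document doc, String id) {
        try {
            String expr = "//*[@id='" + id + "']";
            return (Element) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<html><body><h1 id=\"hey\">Hello, World!</h1></body></html>");

        // No DTD declares "id" as type ID, so the built-in lookup finds nothing.
        System.out.println(doc.getElementById("hey"));

        // The XPath-based lookup matches on the attribute value instead.
        Element h1 = findById(doc, "hey");
        System.out.println(h1.getTextContent());

        // Alternatively, flag the attribute as an ID on this element;
        // getElementById can then resolve it.
        h1.setIdAttribute("id", true);
        System.out.println(doc.getElementById("hey") == h1);
    }
}
```

Note that this only works while the template is well-formed XML. Real-world HTML (unclosed tags, entities such as &nbsp;) will make the XML parser throw, which is one reason an HTML parser is usually the easier route.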

I modified your example to use jsoup:

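import java.io.File;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public static void main(String[] args) {
    try {
        File file = new File("C:\\Users\\<username>\\Desktop\\template.html");
        // jsoup parses HTML and resolves elements by their id attribute
        Document doc = Jsoup.parse(file, "UTF-8");
        Element elementById = doc.getElementById("hey");
        System.out.println("hey =" + elementById.ownText()); // text of the element only
        System.out.println("hey =" + elementById);           // the element's full markup
    } catch (Exception e) {
        e.printStackTrace();
    }
}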
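The question also asks how to add new tags once the element has been found. If you would rather stay with the standard DOM API than add jsoup, the following is a rough sketch of that step. The class InsertTagSketch and the method appendChildTo are invented for illustration, and the input is assumed to be well-formed XML. It scans all elements for a matching id attribute, appends a new child element, and serializes the result with an identity Transformer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class InsertTagSketch {

    // Appends <tag>text</tag> as the last child of the first element whose
    // "id" attribute equals id, then returns the serialized document.
    // If no element matches, the document is returned unchanged.
    public static String appendChildTo(String xml, String id, String tag, String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // "*" matches every element in document order.
            NodeList all = doc.getElementsByTagName("*");
            for (int i = 0; i < all.getLength(); i++) {
                Element candidate = (Element) all.item(i);
                if (id.equals(candidate.getAttribute("id"))) {
                    Element child = doc.createElement(tag);
                    child.setTextContent(text);
                    candidate.appendChild(child);
                    break; // stop at the first match
                }
            }

            // Identity transform: writes the DOM tree back out as markup.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.METHOD, "xml");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String template = "<html><body><h1 id=\"hey\">Hello, World!</h1></body></html>";
        System.out.println(appendChildTo(template, "hey", "em", "new"));
    }
}
```

With jsoup the same step is shorter, since its Element class offers methods such as append(String html) for adding markup to an element.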

