Java DOM parser returns null document

Question

I have an HTML template which I want to read in:

<html>
   <head>
      <title>TEST</title>
   </head>
   <body>
      <h1 id="hey">Hello, World!</h1>
   </body>
</html>

I want to find the tag with the id hey and then insert new content (e.g. new tags). For this purpose I use the DOM parser, but my code returns null:

public static void main(String[] args) {

    try {
        File file = new File("C:\\Users\\<username>\\Desktop\\template.html");
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        Document doc = dBuilder.parse(file);
        doc.getDocumentElement().normalize();

        System.out.println(doc.getElementById("hey")); // returns null

    } catch (Exception e) {
        e.printStackTrace();
    }

}

What am I doing wrong?

Answer 1

You are parsing the file with the Java XML API, which follows the XML specification strictly and makes no allowances for HTML conventions.

In XML, an attribute named id is not automatically of type ID, so getElementById() does not find it. You can either use another library (jsoup, for example), instruct the parser to treat id as an ID (via a DTD), or write custom lookup code.
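As a rough illustration of the "custom code" option, here is a minimal sketch that stays within the JDK. The class name DomIdLookup and its helper methods are made up for this example, and the template is inlined as a string instead of being read from the file in the question. It shows two possible approaches: querying the plain id attribute with XPath, and marking id attributes as ID type with Element.setIdAttribute so that getElementById can find them. It assumes the input is well-formed XML, as the template in the question is.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class DomIdLookup {

    // Parses well-formed markup from a string; checked exceptions are wrapped.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse input", e);
        }
    }

    // Option A: bypass getElementById and match the plain "id" attribute via XPath.
    // Assumes the id value contains no single quote.
    static Element findByIdAttribute(Document doc, String id) {
        try {
            String expr = "//*[@id='" + id + "']";
            return (Element) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("XPath lookup failed", e);
        }
    }

    // Option B: declare every "id" attribute to be of type ID,
    // after which getElementById can resolve it.
    static void registerIdAttributes(Document doc) {
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element el = (Element) all.item(i);
            if (el.hasAttribute("id")) {
                el.setIdAttribute("id", true);
            }
        }
    }

    public static void main(String[] args) {
        String template = "<html><head><title>TEST</title></head>"
                + "<body><h1 id=\"hey\">Hello, World!</h1></body></html>";
        Document doc = parse(template);

        // Option A works without any ID registration.
        Element viaXPath = findByIdAttribute(doc, "hey");
        System.out.println(viaXPath.getTextContent());

        // Option B makes the original getElementById call usable.
        registerIdAttributes(doc);
        Element h1 = doc.getElementById("hey");

        // Add a new tag next to the element that was found.
        Element p = doc.createElement("p");
        p.setTextContent("New content");
        h1.getParentNode().appendChild(p);
        System.out.println(h1.getParentNode().getChildNodes().getLength());
    }
}
```

The XPath variant is probably the least invasive, since it leaves the document untouched. Registering the attributes is more convenient if getElementById is called in many places. Neither approach copes with real-world HTML that is not well-formed XML, which is where a dedicated HTML parser is likely the better fit.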

Answer 2

I modified your example to use jsoup:

import java.io.File;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public static void main(String[] args) {
    try {
        File file = new File("C:\\Users\\<username>\\Desktop\\template.html");
        // jsoup parses the file as HTML, so id is treated as an identifier without a DTD
        Document doc = Jsoup.parse(file, "UTF-8");
        Element hey = doc.getElementById("hey");
        System.out.println("hey = " + hey.ownText()); // text of the element itself
        System.out.println("hey = " + hey);           // outer HTML of the element
    } catch (Exception e) {
        e.printStackTrace();
    }
}

2 answers

solution1
4 ACCEPTED 2016-02-16 14:27:08

solution2
2 2016-02-16 14:49:03
