JSoup Extract Text by id

Question

I want to extract the text "Inbox (100)" from an HTML element, looking the element up by its id. My test case looks like this:

    String html = "<td id=\"e-mailoutline-row\" title=\"Inbox\" class=\"outline-text\">Inbox (100)</td>";

    Document doc = Jsoup.parse(html);
    Element numberofEmails = doc.getElementById("e-mailoutline-row");

The issue is that numberofEmails is always null, so I can't even get the text, let alone work towards the actual number in the brackets.

I also tried:

    String html = "<head><body><td id=\"e-mailoutline-row\" title=\"Inbox\" class=\"outline-text\">Inbox (100)</td></body></head>";

Once I get the test case working I will use it to extract this text from a much larger document.

This should be simple. What am I missing?

Answer 1

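The calls themselves were correct; the problem was the HTML. Jsoup parses input the way a browser does, and an HTML parser ignores a td start tag that is not inside a table, so the document never contained an element with that id. The following html test case worked exactly as intended: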

    String html = "<head><body><table><tr><td id=\"e-mailoutline-row\">Inbox (100)</td></tr></table></body></head>";

Note that I had to add not only <head> and <body> but <table> and <tr> too. It did not work with only <head> and <body> added to the original test case.

Thanks to @soorapadman and @Yaroslav for pointing me in the right direction.

Answer 2

Jsoup follows the HTML element hierarchy when it parses. For a td tag to be parsed, it has to be nested as table -> tr -> td:

    String html = "<head><body><table><tr><td id=\"e-mailoutline-row\">Inbox (100)</td></tr></table></body></head>";
    Document doc = Jsoup.parse(html);
    Element numberofEmails = doc.getElementById("e-mailoutline-row");
    System.out.println(numberofEmails.text());

Output:

Inbox (100)
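
The question also mentions getting at the actual number in the brackets. Once text() returns "Inbox (100)", one way to do that is a small regular expression. This is only a sketch: InboxCount and parseCount are hypothetical names, not part of Jsoup, and it assumes the count is a run of digits in parentheses at the end of the text.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Jsoup): extracts the count from text such as "Inbox (100)".
final class InboxCount {
    // Assumption: the count is a run of digits inside parentheses at the end of the text.
    private static final Pattern COUNT = Pattern.compile("\\((\\d+)\\)\\s*$");

    static OptionalInt parseCount(String text) {
        if (text == null) {
            return OptionalInt.empty();
        }
        Matcher m = COUNT.matcher(text);
        if (!m.find()) {
            return OptionalInt.empty(); // no "(digits)" at the end of the text
        }
        try {
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        } catch (NumberFormatException e) {
            return OptionalInt.empty(); // digits do not fit in an int
        }
    }

    public static void main(String[] args) {
        System.out.println(parseCount("Inbox (100)").getAsInt()); // prints 100
    }
}
```

In the test case above this would be called as InboxCount.parseCount(numberofEmails.text()). Returning an OptionalInt rather than a bare int leaves it to the caller to decide what to do when the text has no count.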
