JSoup: Extract Text by id

I want to extract the text "Inbox (100)" from HTML, looking up the enclosing tag by its id. My test case looks like this:

    String html = "<td id=\"e-mailoutline-row\" title=\"Inbox\" class=\"outline-text\">Inbox (100)</td>";

    Document doc = Jsoup.parse(html);
    Element numberofEmails = doc.getElementById("e-mailoutline-row");

The issue is that numberofEmails is always null, so I can't even get the text, let alone work towards the actual number in the brackets.

I also tried:

    String html = "<head><body><td id=\"e-mailoutline-row\" title=\"Inbox\" class=\"outline-text\">Inbox (100)</td></body></head>";

Once I get the test case working I will use it to extract this text from a much larger document.

This should be simple. What am I missing?

The syntax of the commands was correct, but it appears JSoup is picky about the HTML being well formed. The following HTML test case worked exactly as intended:

    String html = "<head><body><table><tr><td id=\"e-mailoutline-row\">Inbox (100)</td></tr></table></body></head>";

Note that I had to add not only <table> and </table> but <tr> and </tr> too. It did not work when only head and body were added to the original test case.

Thanks to @soorapadman and @Yaroslav for pointing me in the right direction.

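Jsoup always follows the HTML element hierarchy. For a td tag to be parsed, it has to be nested as table -> tr -> td. Outside a table the parser discards the <td> tag and keeps only its text, which is why getElementById returned null in the original test case.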

    String html = "<head><body><table><tr><td id=\"e-mailoutline-row\">Inbox (100)</td></tr></table></body></head>";
    Document doc = Jsoup.parse(html);
    Element numberofEmails = doc.getElementById("e-mailoutline-row");
    System.out.println(numberofEmails.text());

Output:

Inbox (100)
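
The question also mentions working towards the actual number in the brackets. Once the text "Inbox (100)" is available, one possible way to pull the count out is a small regular expression using only the Java standard library. This is a minimal sketch, not part of the original answers: the class name InboxCount and the method parseCount are hypothetical, and it assumes the count is a non-negative integer of at most nine digits inside parentheses.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the original post.
public class InboxCount {
    // One to nine digits inside parentheses, e.g. "(100)".
    // The nine-digit cap keeps Integer.parseInt from overflowing.
    private static final Pattern COUNT_IN_PARENS = Pattern.compile("\\((\\d{1,9})\\)");

    /** Returns the first parenthesized integer in the text, or empty if none is found. */
    public static OptionalInt parseCount(String text) {
        Matcher m = COUNT_IN_PARENS.matcher(text);
        if (m.find()) {
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // In the real program this string would come from numberofEmails.text().
        OptionalInt count = parseCount("Inbox (100)");
        System.out.println(count.getAsInt()); // prints 100
    }
}
```

Returning an OptionalInt rather than a plain int makes the "no number found" case explicit, which may matter when the same code is run against a much larger document where the cell text is not guaranteed to contain a count.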

