I have this piece of HTML:
<td class="my class" >
<div class="content" style="margin-left:10px;">
<ul style="list-style-type: disc;">
<li><span>obj: blue</span></li>
<li><span>descr: red</span></li>
<li><span>double: yellow</span></li>
</ul>
</div>
</td>
I need to have:
obj: blue
descr: red
double: yellow
I already tried:
docDescription.select("my.class").text();
But it returns the block, with all the text. I need 3 different parts (line by line).
Select the individual span elements instead and iterate over the result:
docDescription.select("div > ul > li > span");
Your document is invalid, so jsoup sees it as shown below. jsoup always tries to repair a document while parsing; in your case the td is outside of any table, so it is removed.
<html>
<head></head>
<body>
<div class="content" style="margin-left:10px;">
<ul style="list-style-type: disc;">
<li><span>obj: blue</span></li>
<li><span>descr: red</span></li>
<li><span>double: yellow</span></li>
</ul>
</div>
</body>
</html>
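If the td wrapper itself matters, one option is to parse the fragment with jsoup's XML parser, which does not apply the HTML table rules. The following is only a sketch under that assumption; the class name KeepTdExample and the helper countTd are hypothetical names made up for this illustration, and jsoup is assumed to be on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

public class KeepTdExample {

    // Hypothetical helper: counts the <td> elements that survive parsing.
    static int countTd(String html, boolean useXmlParser) {
        Document doc = useXmlParser
                ? Jsoup.parse(html, "", Parser.xmlParser()) // no HTML repair rules applied
                : Jsoup.parse(html);                        // default HTML parser
        return doc.select("td").size();
    }

    public static void main(String[] args) {
        String html = "<td class=\"my class\"><div class=\"content\"><ul>"
                + "<li><span>obj: blue</span></li></ul></div></td>";
        System.out.println("HTML parser: " + countTd(html, false));
        System.out.println("XML parser:  " + countTd(html, true));
    }
}
```

With the HTML parser the count should be 0 for this fragment, while the XML parser should keep the single td. Note that class="my class" assigns two classes, my and class, so a selector such as td.my.class would only be useful on a document where the td is still present.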
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Main {
    public static void main(String[] args) {
        String html = "<td class=\"my class\" >\n" +
                " <div class=\"content\" style=\"margin-left:10px;\">\n" +
                " <ul style=\"list-style-type: disc;\">\n" +
                " <li><span>obj: blue</span></li>\n" +
                " <li><span>descr: red</span></li>\n" +
                " <li><span>double: yellow</span></li>\n" +
                " </ul>\n" +
                " </div>\n" +
                "</td>";
        // Select each span inside the list items rather than the whole block.
        Elements select = Jsoup.parse(html).select("div > ul > li > span");
        // Print the text of every matched span on its own line.
        for (Element element : select) {
            System.out.println(element.text());
        }
    }
}
obj: blue
descr: red
double: yellow
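If the three values are needed as separate strings rather than printed, the matched elements can be collected into a list. This is a sketch, assuming a jsoup version that provides Elements.eachText(); the class name SpanLines and the method extract are hypothetical names used only for this example.

```java
import java.util.List;
import org.jsoup.Jsoup;

public class SpanLines {

    // Hypothetical helper: returns the text of each matching span as its own entry.
    static List<String> extract(String html) {
        return Jsoup.parse(html)
                .select("div > ul > li > span")
                .eachText();
    }

    public static void main(String[] args) {
        String html = "<div class=\"content\"><ul>"
                + "<li><span>obj: blue</span></li>"
                + "<li><span>descr: red</span></li>"
                + "<li><span>double: yellow</span></li>"
                + "</ul></div>";
        // Each entry can then be handled individually.
        for (String line : extract(html)) {
            System.out.println(line);
        }
    }
}
```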