Extract first-level elements in jsoup, without recursing into nested tables
I need to display the top-level rows of this table, which has another table nested inside it:
<html><body>
<div id = div1><table><tbody>
<tr><td>Steve</td>
<td><table><tbody><tr><td>Steve2</td></tr></tbody></table>
</tr></tbody></table></body></html>
There can be more than one row. I want to extract the content of the tr elements at the first level only (not
<tr><td>Steve2</td></tr>
).
This is the code:
String html = "<html><body>"
+ "<div id = div1><table><tbody>"
+ "<tr><td>Steve</td>"
+ "<td><table><tbody><tr><td>Steve2</td></tr></tbody></table>"
+ "</tr></tbody></table></body></html>";
Document doc = Jsoup.parse(html);
Elements elemHtml = doc.select("div#div1>table");
for (Element elem1 : elemHtml) {
    Elements elem2 = elem1.select("tr");
    for (Element elem3 : elem2) {
        System.out.println("Content: " + elem3);
        System.out.println("----------");
    }
}
I tried adding a <div> tag inside the table, but then the parse doesn't work.
Change your CSS selector to
div#div1>table>tbody>tr
to match only the <tr> elements that are directly under your <tbody> element; that is what > (the child combinator) means in CSS.
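To make the difference concrete, here is a minimal sketch that counts the rows matched by a descendant selector and by the child-combinator selector. The class name ChildCombinatorDemo and the helper countRows are made up for illustration, and the markup is the question's HTML with the unclosed tags tidied; it assumes jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class ChildCombinatorDemo {
    // The question's markup, with the missing </td> and </div> added.
    static final String HTML = "<html><body>"
        + "<div id=\"div1\"><table><tbody>"
        + "<tr><td>Steve</td>"
        + "<td><table><tbody><tr><td>Steve2</td></tr></tbody></table></td>"
        + "</tr></tbody></table></div></body></html>";

    // Hypothetical helper: number of elements the selector matches in HTML.
    static int countRows(String selector) {
        Document doc = Jsoup.parse(HTML);
        return doc.select(selector).size();
    }

    public static void main(String[] args) {
        // Descendant selector: matches the outer row and the nested row.
        System.out.println(countRows("div#div1 tr"));
        // Child combinators: matches only rows directly under the outer tbody.
        System.out.println(countRows("div#div1 > table > tbody > tr"));
    }
}
```

The first selector walks the whole subtree, which is why the question's select("tr") also returned the nested row; the second one stops at each direct-child step.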
I've made some more complex HTML, to show that the solution works for a more general case than the one in the question:
<html> <body> <div id = div1> <table> <tbody>
<tr> <td>Steve1</td> <td> <table> <tbody> <tr>
<td>Steve2a</td> </tr> <tr> <td>Steve2b</td>
</tr> </tbody> </table> </tr> <tr> <td>Steve3</td>
<td> <table> <tbody> <tr> <td>Steve4</td> </tr>
</tbody> </table> </tr> </tbody> </table>
</body> </html>
which renders as a table with two outer rows (Steve1 and Steve3), each holding a nested table in its second cell.
Use the following selector to get all of the outer table's rows:
div#div1>table> tbody > tr
and then iterate over these rows to get the first cell of each row:
select("td").first()
Full code:
Document doc = null;
String html2 = "<html> <body> <div id = div1> <table> <tbody>" +
"<tr> <td>Steve1</td> <td> <table> <tbody> <tr>" +
"<td>Steve2a</td> </tr> <tr> <td>Steve2b</td>" +
"</tr> </tbody> </table> </tr> <tr> <td>Steve3</td>" +
"<td> <table> <tbody> <tr> <td>Steve4</td> </tr>" +
"</tbody> </table> </tr> </tbody> </table>" +
"</body> </html>";
doc = Jsoup.parse(html2);
Elements outerRows = doc.select("div#div1>table> tbody > tr");
for (Element row : outerRows) {
    // first <td> of the outer row; later cells hold the nested table
    Element data = row.select("td").first();
    System.out.println(data);
    System.out.println("------------");
}
If you want only the text (SteveX), then you can get it with the text() method:
System.out.println(data.text());
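Putting the pieces together, the following sketch collects the first-cell text of every outer row into a list rather than printing it. The class name OuterRowText and the method firstCellTexts are hypothetical names chosen for this example, and the null check is an added precaution for rows that have no <td>; it assumes jsoup is on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class OuterRowText {
    // Same structure as the answer's sample: two outer rows, each with a nested table.
    static final String HTML = "<html><body><div id=\"div1\"><table><tbody>"
        + "<tr><td>Steve1</td><td><table><tbody>"
        + "<tr><td>Steve2a</td></tr><tr><td>Steve2b</td></tr>"
        + "</tbody></table></td></tr>"
        + "<tr><td>Steve3</td><td><table><tbody>"
        + "<tr><td>Steve4</td></tr>"
        + "</tbody></table></td></tr>"
        + "</tbody></table></div></body></html>";

    // Hypothetical helper: text of the first cell of each outer row.
    static List<String> firstCellTexts(String html) {
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        for (Element row : doc.select("div#div1 > table > tbody > tr")) {
            Element first = row.select("td").first();
            if (first != null) {            // skip rows without any <td>
                texts.add(first.text());
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(firstCellTexts(HTML));
    }
}
```

Returning a list keeps the selection logic separate from the output, so the same helper can feed a printout, a test, or further processing.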