Extract first-level elements in jsoup, without recursion
I have this HTML and I need the "li" elements. I use .select("li"), but each "li" may contain another "li", and I'm not interested in those. I only want the "li" elements in the first level. Is it possible?
<div id="id">
<ul>
<li>
<div>
<ul>
<li> ........ </li>
</ul>
</div>
</li>
<li> ........ </li>
<li> ........ </li>
<li> ........ </li>
<li> ........ </li>
<li> ........ </li>
.
.
.
</ul>
</div>
It's even simpler: use a CSS selector like
document.select("div#id > ul > li")
When you use ">" you say that you only want the first-level children of the given DOM element. Take a look at this code: https://gist.github.com/wololock/621a42546cac6dd0daa2 You can simply run it as a Groovy script.
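For Java users, the same selector can be tried in a small standalone program. This is only a sketch: the class name, the method names, and the sample markup (modelled on the question's HTML) are made up for illustration, and it assumes jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class FirstLevelSelect {
    // Hypothetical sample markup modelled on the question's HTML.
    static final String HTML =
        "<div id=\"id\"><ul>"
        + "<li>A<div><ul><li>nested</li></ul></div></li>"
        + "<li>B</li>"
        + "<li>C</li>"
        + "</ul></div>";

    // Counts every li in the document, nested ones included.
    static int countAll(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("li").size();
    }

    // Counts only the li elements that are direct children of the ul
    // that is itself a direct child of div#id.
    static int countFirstLevel(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("div#id > ul > li").size();
    }

    public static void main(String[] args) {
        System.out.println("all li: " + countAll(HTML));
        System.out.println("first-level li: " + countFirstLevel(HTML));
    }
}
```

The nested li is skipped because its parent ul sits under a div without id="id", so it never matches the "div#id > ul > li" chain.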
Sure it is:
Element div = document.getElementById("id"); //get the div by its id
Element theList = div.child(0); //get the unordered list
Element listItem = theList.child(0); //this is the first list item in that unordered list
This answer assumes you've already loaded the HTML and have the JSoup Document ready for traversing.
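To collect every first-level li rather than just the first one, the traversal idea can be extended with children(), which returns direct child elements only. A minimal sketch, assuming jsoup is on the classpath; the class name, method name, and sample markup are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class FirstLevelChildren {
    // Hypothetical sample markup modelled on the question's HTML.
    static final String HTML =
        "<div id=\"id\"><ul>"
        + "<li>A<div><ul><li>nested</li></ul></div></li>"
        + "<li>B</li>"
        + "<li>C</li>"
        + "</ul></div>";

    // Returns the own text of each first-level li, ignoring nested lists.
    static List<String> firstLevelTexts(String html) {
        Document document = Jsoup.parse(html);
        Element div = document.getElementById("id"); // the outer div
        Element theList = div.child(0);              // its first child element, the ul
        List<String> texts = new ArrayList<>();
        for (Element li : theList.children()) {      // direct child elements only
            texts.add(li.ownText());                 // text of the li itself, not of its descendants
        }
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(firstLevelTexts(HTML));
    }
}
```

This relies on the ul being the first child element of the div, as in the question's markup; if the structure can vary, a selector is the safer choice.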
You have to use the CSS child combinator > to specify that you want direct children only.
This can be done relative to an element, as in the following example:
Element body = Jsoup.parseBodyFragment("<div id=\"id\">...</div>").body();
body.select(">div>ul>li"); // returns every li that is a direct child of the ul directly under the div
And from a ul element, retrieving all first-level li:
ul.select(">li");
Or in an absolute way (cf. @Szymon's answer):
document.select("div#id > ul > li")
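The relative form can be checked from a ul element with a short program. This is a sketch under assumptions: jsoup is on the classpath, and the class name, method names, and sample markup are invented for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class RelativeChildSelect {
    // Hypothetical sample markup modelled on the question's HTML.
    static final String HTML =
        "<div id=\"id\"><ul>"
        + "<li>A<div><ul><li>nested</li></ul></div></li>"
        + "<li>B</li>"
        + "<li>C</li>"
        + "</ul></div>";

    // Locates the outer ul, the element the relative queries start from.
    static Element outerList(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("div#id > ul").first();
    }

    // A leading ">" restricts the match to direct children of the ul.
    static int countDirectItems(String html) {
        return outerList(html).select("> li").size();
    }

    // Without the combinator, every descendant li of the ul matches.
    static int countDescendantItems(String html) {
        return outerList(html).select("li").size();
    }

    public static void main(String[] args) {
        System.out.println("direct li: " + countDirectItems(HTML));
        System.out.println("descendant li: " + countDescendantItems(HTML));
    }
}
```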