Extract first-level elements in jsoup, without recursion

I have this HTML and I need the "li" elements. I use .select("li"), but each "li" may contain another "li", and I'm not interested in those. I only want the "li" elements at the first level. Is that possible?

<div id="id">
    <ul>
        <li>  
            <div>
                <ul>
                    <li> ........ </li>
                </ul>
            </div>      
        </li>
        <li> ........ </li>
        <li> ........ </li>
        <li> ........ </li>
        <li> ........ </li>
        <li> ........ </li>
        .
        .
        .
    </ul>
</div>

It's even simpler: use a CSS selector like

document.select("div#id > ul > li")

When you use ">", you say that you only want the direct (first-level) children of the given DOM element. Take a look at this code (https://gist.github.com/wololock/621a42546cac6dd0daa2), which you can simply run as a Groovy script.
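As a rough Java sketch of the same idea, assuming jsoup is on the classpath: the class name `FirstLevelLi` and the sample markup are made up for illustration and only mimic the shape of the HTML in the question. It compares the plain descendant selector with the child-combinator selector.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class FirstLevelLi {
    // Hypothetical sample markup: the first item contains a nested list.
    static final String HTML =
        "<div id=\"id\"><ul>"
        + "<li><div><ul><li>nested</li></ul></div></li>"
        + "<li>second</li>"
        + "<li>third</li>"
        + "</ul></div>";

    // Every li at any depth in the document, nested ones included.
    static Elements allItems(Document doc) {
        return doc.select("li");
    }

    // Only li elements whose parent is the ul sitting directly under div#id.
    static Elements firstLevelItems(Document doc) {
        return doc.select("div#id > ul > li");
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        System.out.println(allItems(doc).size());        // 4
        System.out.println(firstLevelItems(doc).size()); // 3
    }
}
```

The nested "li" is skipped because its parent "ul" is a child of a plain "div", not of "div#id".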

Sure it is:

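Element div = document.getElementById("id"); // get the div by its id attribute
Element theList = div.child(0); // the div's first child element is the unordered list
Element listItem = theList.child(0); // this is the first list item in that unordered list
Elements listItems = theList.children(); // all first-level list items, nested ones excluded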

This answer assumes you've already loaded the HTML and have the jsoup Document ready for traversing.

Reference:

You have to use the CSS child combinator ">" to specify that you want direct children only.

This can be done relative to an element, as in the following example:

Element body = Jsoup.parseBodyFragment("<div id=\"id\">...</div>").body(); // body() returns the wrapping <body> element
body.select("> div > ul > li"); // returns the li elements that are direct children of the div's ul

And from a ul element, to retrieve all of its first-level li elements:

ul.select(">li");

Or in an absolute way (cf. @Szymon's answer):

document.select("div#id > ul > li")
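For the relative form, here is a small sketch, assuming jsoup 1.11 or later for `selectFirst`: the class `RelativeChildren`, its helper methods, and the sample markup are all hypothetical. It starts from the outer ul element and compares a selector with a leading ">" against the plain `children()` traversal call.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class RelativeChildren {
    // Hypothetical sample markup with one nested list inside the first item.
    static final String HTML =
        "<div id=\"id\"><ul>"
        + "<li><div><ul><li>nested</li></ul></div></li>"
        + "<li>second</li>"
        + "<li>third</li>"
        + "</ul></div>";

    // The first ul in document order, i.e. the outer list.
    static Element outerList(String html) {
        return Jsoup.parseBodyFragment(html).body().selectFirst("ul");
    }

    // A leading ">" makes the selector relative to the element it is called on.
    static Elements viaSelector(Element ul) {
        return ul.select("> li");
    }

    // children() returns the direct child elements without any selector parsing.
    static Elements viaChildren(Element ul) {
        return ul.children();
    }

    public static void main(String[] args) {
        Element ul = outerList(HTML);
        System.out.println(ul.select("li").size());   // 4, includes the nested li
        System.out.println(viaSelector(ul).size());   // 3
        System.out.println(viaChildren(ul).size());   // 3
    }
}
```

In well-formed markup the two approaches should give the same items. `children()` does not filter by tag name, so the selector form is probably the safer choice if the list might contain stray non-li elements.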
