I have a group of HTML elements like the following:
<div class="abcdefghijk">
<p>a</p>
<p>b</p>
<p>c</p>
<p>d</p>
<p>e</p>
<p>f</p>
<p>h</p>
<p>i</p>
<p>j</p>
<p>k</p>
</div>
I want to select the first 5 <p> elements. Please help!
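From https://jsoup.org/cookbook/extracting-data/selector-syntax we can learn about:
:lt(n): find elements whose sibling index (i.e. its position in the DOM tree relative to its parent) is less than n; e.g. td:lt(3)
So, based on your example, all you need is select("div.abcdefghijk p:lt(5)").
Note that the sibling index counts every element child of the parent, not only <p> elements. That makes no difference here, because the div contains nothing but <p> children.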
Demo:
String html = " <div class=\"abcdefghijk\">\r\n" +
" <p>a</p>\r\n" +
" <p>b</p>\r\n" +
" <p>c</p>\r\n" +
" <p>d</p>\r\n" +
" <p>e</p>\r\n" +
" <p>f</p>\r\n" +
" <p>h</p>\r\n" +
" <p>i</p>\r\n" +
" <p>j</p>\r\n" +
" <p>k</p>\r\n" +
"</div>";
Document doc = Jsoup.parse(html);
Elements elements = doc.select("div.abcdefghijk p:lt(5)");
for (Element el : elements){
System.out.println(el);
}
Output:
<p>a</p>
<p>b</p>
<p>c</p>
<p>d</p>
<p>e</p>
To achieve the expected result, use the nth-child selector :nth-child(-n+5):
select("div.abcdefghijk :nth-child(-n+5)")
If you want to select all of them anyway, but do something special with the first 5, use Elements#subList(fromIndex, toIndex) (inherited from ArrayList):
Returns a view of the portion of this list between the specified fromIndex, inclusive, and toIndex, exclusive.
String html =
"<div class=\"abcdefghijk\">" +
"<p>a</p><p>b</p><p>c</p><p>d</p><p>e</p>" + // get these
"<p>f</p><p>h</p><p>i</p><p>j</p><p>k</p>" +
"</div>";
Document doc = Jsoup.parse(html);
Elements paras = doc.select("div.abcdefghijk p");
for (Element el : paras.subList(0, Math.min(5, paras.size()))) {
System.out.println(el);
}
Output:
<p>a</p>
<p>b</p>
<p>c</p>
<p>d</p>
<p>e</p>
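The Math.min guard and the "view" wording in the subList documentation can be tried out without jsoup. The following is a minimal sketch, assuming a plain List<String> as a stand-in for Elements (which extends ArrayList<Element>); the class name FirstN and the helper firstN are hypothetical names made up for this illustration, not part of jsoup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FirstN {
    // Hypothetical helper: returns a view of at most the first n items.
    // For n >= 0 the Math.min guard keeps subList from throwing when the
    // list holds fewer than n items.
    static <T> List<T> firstN(List<T> items, int n) {
        return items.subList(0, Math.min(n, items.size()));
    }

    public static void main(String[] args) {
        List<String> paras =
                new ArrayList<>(Arrays.asList("a", "b", "c", "d", "e", "f", "h"));

        List<String> firstFive = firstN(paras, 5);
        System.out.println(firstFive);        // [a, b, c, d, e]

        // subList returns a view backed by the original list, so a
        // non-structural change made through the view shows up in paras.
        firstFive.set(0, "A");
        System.out.println(paras.get(0));     // A

        // A list shorter than n is returned whole instead of throwing.
        System.out.println(firstN(Arrays.asList("x", "y"), 5)); // [x, y]

        // Copy the view when an independent list is needed.
        List<String> copy = new ArrayList<>(firstN(paras, 5));
        copy.set(0, "changed");
        System.out.println(paras.get(0));     // still A
    }
}
```

Because the result is a view, adding to or removing from the original list afterwards invalidates it; copying it into a new ArrayList, as in the last step, avoids that if the first five items need to outlive later changes.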