Suppose I have HTML like this:
<div class="pets">
<div class="pet">...</div>
<div class="pet">...</div>
<div class="pet">...</div>
<div class="pet">...</div>
<div class="friends-pets">Your friends have these pets:</div>
<div class="pet">...</div>
<div class="pet">...</div>
<div class="pet">...</div>
<div class="pet">...</div>
<div class="pet">...</div>
<div class="pet">...</div>
</div>
I want to get only the <div class="pet"> elements that come before <div class="friends-pets">. Is there a way to do it with Jsoup? I know I can get all pets like this:
Element petsWrapper = document.selectFirst(".pets");
Elements pets = petsWrapper.select(".pet");
but that would include the friends' pets too. Can I select only the pets above the friends-pets div, or remove the pets below it and then use that code?
There is a very simple way to do it with a single selector:
.pet:not(.friends-pets ~ .pet)
This works by combining the :not() pseudo-class with the general sibling combinator. The inner selector .friends-pets ~ .pet matches every .pet element that follows the .friends-pets element as a sibling, and :not() excludes those elements from the rest of the .pet matches.
See a working online example here: try.jsoup
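For anyone who wants to try the selector locally, here is a minimal sketch, assuming Jsoup is on the classpath. The class name OwnPetsSelector, the helper selectOwnPets and the pet names in the sample markup are made up for illustration; only the selector itself comes from the answer above.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class OwnPetsSelector {

    // Hypothetical helper: returns the .pet elements that have
    // no preceding .friends-pets sibling.
    static Elements selectOwnPets(String html) {
        Document document = Jsoup.parse(html);
        return document.select(".pet:not(.friends-pets ~ .pet)");
    }

    public static void main(String[] args) {
        // Sample markup shaped like the question's; the pet names are invented.
        String html =
              "<div class=\"pets\">"
            + "<div class=\"pet\">Rex</div>"
            + "<div class=\"pet\">Tom</div>"
            + "<div class=\"friends-pets\">Your friends have these pets:</div>"
            + "<div class=\"pet\">Bella</div>"
            + "<div class=\"pet\">Luna</div>"
            + "</div>";

        // Print the text of each pet that comes before the friends-pets div.
        for (Element pet : selectOwnPets(html)) {
            System.out.println(pet.text());
        }
    }
}
```

Because the filtering happens inside the selector, nothing has to be removed from the result afterwards.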
Explanation in comments:
Element petsWrapper = document.selectFirst(".pets");
Elements pets = petsWrapper.select(".pet");
// select middle element
Element middleElement = petsWrapper.selectFirst(".friends-pets");
// remove from "pets" every element that comes after the middle element
pets.removeAll(middleElement.nextElementSiblings());
System.out.println(pets);
I'm gonna check out Krystian's answer, but having tried to solve this myself, I've come up with this one:
// get all divs
Elements divElements = doc.select("div");
// valid pet divs will be collected here
List<Element> pets = new ArrayList<>();
for (Element divElement : divElements) {
    if (divElement.hasClass("friends-pets")) {
        // the friends' pets start here, so the loop stops
        break;
    }
    // hasClass matches the exact class name; className().contains("pet")
    // would also match the "pets" wrapper div
    if (divElement.hasClass("pet")) {
        // no friends-pets div seen so far, so this is one of my pets
        pets.add(divElement);
    }
}
If you think there's something wrong with this answer, please let me know. If you have reasons for why I should use one of the current two answers over the other, please comment too!