
Jsoup: get all elements before a certain element / remove all elements after a certain element

Suppose I have HTML like this:

<div class="pets">
  <div class="pet">...</div>
  <div class="pet">...</div>
  <div class="pet">...</div>
  <div class="pet">...</div>
  <div class="friends-pets">Your friends have these pets:</div>
  <div class="pet">...</div>
  <div class="pet">...</div>
  <div class="pet">...</div>
  <div class="pet">...</div>
  <div class="pet">...</div>
  <div class="pet">...</div>
</div>

I want to get only the <div class="pet"> elements that come before <div class="friends-pets">. Is there a way to do this with Jsoup? I know I can get all pets like this:

Element petsWrapper = document.selectFirst(".pets");
Elements pets = petsWrapper.select(".pet");

but that would include the friends' pets too. Can I either select only the pets above the friends-pets div, or remove the pets below it and then reuse the code above?

There is a very simple way you can do it with a single selector:

.pet:not(.friends-pets ~ .pet)

This works because .friends-pets ~ .pet (the general sibling combinator) matches every .pet element that follows the .friends-pets div under the same parent. Wrapping that in :not() excludes those matches, which leaves only the .pet elements that come before it.

See a working online example here: try.jsoup
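
For reference, here is a minimal self-contained sketch of how that selector might be used from Java. The class name, the method name and the sample pet names are made up for illustration; only the selector itself comes from the answer above. It assumes a jsoup version recent enough to provide Elements.eachText().

```java
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class PetsBeforeMarker {
    // Hypothetical sample markup modeled on the question; the pet names are invented.
    static final String HTML =
        "<div class=\"pets\">"
        + "<div class=\"pet\">cat</div>"
        + "<div class=\"pet\">dog</div>"
        + "<div class=\"friends-pets\">Your friends have these pets:</div>"
        + "<div class=\"pet\">parrot</div>"
        + "<div class=\"pet\">hamster</div>"
        + "</div>";

    // Returns the text of every .pet that is NOT a later sibling of .friends-pets.
    static List<String> ownPetNames(String html) {
        Document document = Jsoup.parse(html);
        Elements pets = document.select(".pet:not(.friends-pets ~ .pet)");
        return pets.eachText();
    }

    public static void main(String[] args) {
        System.out.println(ownPetNames(HTML)); // prints [cat, dog]
    }
}
```

The selector leaves the document untouched, so the friends' pets are still available later if you need them.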

Explanation in comments:

Element petsWrapper = document.selectFirst(".pets");
Elements pets = petsWrapper.select(".pet");
// select the marker element that separates my pets from my friends' pets
Element middleElement = petsWrapper.selectFirst(".friends-pets");
// remove from "pets" every element that comes after the marker element
pets.removeAll(middleElement.nextElementSiblings());
System.out.println(pets);

I'm gonna check out Krystian's answer, but having tried to solve this myself, I've come up with this one:

// get all divs in the document
Elements divElements = doc.select("div");
// pets that appear before the friends-pets div are collected here
List<Element> pets = new ArrayList<>();
for (Element divElement : divElements) {
    if (divElement.hasClass("friends-pets")) {
        // marker div reached: every div after it belongs to friends, so stop
        break;
    }

    // hasClass compares whole class names, so the "pets" wrapper is skipped
    // (className().contains("pet") would also match "pets" and "friends-pets")
    if (divElement.hasClass("pet")) {
        pets.add(divElement);
    }
}

If you think there's something wrong with this answer, please let me know. If you have reasons for why I should use one of the current two answers over the other, please comment too!
