
Extract and group elements together with jsoup

I am trying to get this output:

*************Movie Titles:*************
World War Z
*************Casts:*************
Brad Pitt
Mireille Enos
James Badge Dale

*************Movie Titles:*************
Monsters University
*************Casts:*************
Johnny Depp
Watsons Junior

<h2 itemprop="name">World War Z</h2>
<div class=info>‎1hr 56min‎‎ - Rated PG13‎‎ - Action/Drama/Horror‎‎ - English‎<br>
 - Cast: 
<span itemprop="actors">Brad Pitt</span>, 
<span itemprop="actors">Mireille Enos</span>, 
<span itemprop="actors">James Badge Dale</span>
</div>

<h2 itemprop="name">Monsters University</h2>
<div class=info>‎2hr 30min‎‎ - Rated PG13‎‎ - Comedy‎‎ - English‎<br>
 - Cast: 
<span itemprop="actors">Johnny Depp</span>, 
<span itemprop="actors">Watsons Junior</span>
</div>

I've tried doing this:

    Elements movieTitle = doc.select("h2");
    for (Element src : movieTitle) {
        for (int i = 0; i < movieTitle.size(); ++i) {
            title += movieTitle.get(i).text() + "\n";
        }
        break;
    }

    Elements casts = doc.select("span[itemprop=actors]");
    for (Element sr : casts) {
        for (int i = 0; i < casts.size(); ++i) {
            cast += casts.get(i).text() + "\n";
        }
        break;
    }
    System.out.println("*************Movie Titles:************* \n" + title);
    System.out.println("*************Casts:************* \n" + cast);

But the output is:

*************Movie Titles:*************
World War Z
Monsters University

*************Casts:*************
Brad Pitt
Mireille Enos
James Badge Dale
Johnny Depp
Watsons Junior

How do I group the casts according to the movies?

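This will give you the results in the desired format. In your HTML each `<div class=info>` comes directly after its `<h2>`, so calling `nextElementSibling()` on the title element returns the block that holds that movie's cast, and you can select the actors from that block only instead of from the whole document.

    Elements items = doc.select("h2");
    for (Element movieElement : items) {
        // The movie name is the text of the <h2> element
        System.out.println("*************Movie Titles:*************\n" + movieElement.text());
        // The <div class=info> right after the <h2> holds this movie's cast
        Element info = movieElement.nextElementSibling();
        if (info == null) continue; // no following sibling, so no cast list
        Elements castElements = info.select("span[itemprop=actors]");
        System.out.println("*************Casts:*************");
        for (Element castElement : castElements) {
            System.out.println(castElement.text());
        }
        System.out.println();
    }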
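
If you need the grouped data for something other than printing, here is a minimal sketch of the same sibling-based approach that collects the result into a map first. The class name `MovieCastGrouper` and the method `groupCasts` are made up for illustration, and the HTML string is just the markup from the question. It assumes every cast block is a `<div class=info>` that directly follows its `<h2>`; if the real page puts other elements in between, the check would need adjusting.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class MovieCastGrouper {

    // Hypothetical helper: maps each movie title to its cast list,
    // keeping the order in which the titles appear in the document.
    static Map<String, List<String>> groupCasts(Document doc) {
        Map<String, List<String>> castsByMovie = new LinkedHashMap<>();
        for (Element title : doc.select("h2[itemprop=name]")) {
            List<String> cast = new ArrayList<>();
            Element info = title.nextElementSibling();
            // Assumption: the cast lives in the <div class=info> right after the <h2>.
            if (info != null && info.hasClass("info")) {
                for (Element actor : info.select("span[itemprop=actors]")) {
                    cast.add(actor.text());
                }
            }
            castsByMovie.put(title.text(), cast);
        }
        return castsByMovie;
    }

    public static void main(String[] args) {
        // Sample markup copied from the question (runtime and rating text omitted).
        String html =
              "<h2 itemprop=\"name\">World War Z</h2>"
            + "<div class=info> - Cast: "
            + "<span itemprop=\"actors\">Brad Pitt</span>, "
            + "<span itemprop=\"actors\">Mireille Enos</span>, "
            + "<span itemprop=\"actors\">James Badge Dale</span></div>"
            + "<h2 itemprop=\"name\">Monsters University</h2>"
            + "<div class=info> - Cast: "
            + "<span itemprop=\"actors\">Johnny Depp</span>, "
            + "<span itemprop=\"actors\">Watsons Junior</span></div>";

        Document doc = Jsoup.parse(html);
        for (Map.Entry<String, List<String>> entry : groupCasts(doc).entrySet()) {
            System.out.println("*************Movie Titles:*************");
            System.out.println(entry.getKey());
            System.out.println("*************Casts:*************");
            for (String actor : entry.getValue()) {
                System.out.println(actor);
            }
            System.out.println();
        }
    }
}
```

A `LinkedHashMap` is used so the movies come out in page order. Note that two movies with the same title would overwrite each other in this sketch; a list of title/cast pairs would avoid that.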

