
Extract and group elements together with jsoup

I am trying to get this output:

*************Movie Titles:*************
World War Z
*************Casts:*************
Brad Pitt
Mireille Enos
James Badge Dale

*************Movie Titles:*************
Monsters University
*************Casts:*************
Johnny Depp
Watsons Junior

<h2 itemprop="name">World War Z</h2>
<div class=info>‎1hr 56min‎‎ - Rated PG13‎‎ - Action/Drama/Horror‎‎ - English‎<br>
 - Cast: 
<span itemprop="actors">Brad Pitt</span>, 
<span itemprop="actors">Mireille Enos</span>, 
<span itemprop="actors">James Badge Dale</span>
</div>

<h2 itemprop="name">Monsters University</h2>
<div class=info>‎2hr 30min‎‎ - Rated PG13‎‎ - Comedy‎‎ - English‎<br>
 - Cast: 
<span itemprop="actors">Johnny Depp</span>, 
<span itemprop="actors">Watsons Junior</span>
</div>

I've tried doing this:

    Elements movieTitle = doc.select("h2");
    for (Element src : movieTitle) {
        for (int i = 0; i < movieTitle.size(); ++i) {
            title += movieTitle.get(i).text() + "\n";
        }
        break;
    }

    Elements casts = doc.select("span[itemprop=actors]");
    for (Element sr : casts) {
        for (int i = 0; i < casts.size(); ++i) {
            cast += casts.get(i).text() + "\n";
        }
        break;
    }
    System.out.println("*************Movie Titles:************* \n" + title);
    System.out.println("*************Casts:************* \n" + cast);

But the output is:

*************Movie Titles:*************
World War Z
Monsters University

*************Casts:*************
Brad Pitt
Mireille Enos
James Badge Dale
Johnny Depp
Watsons Junior

How do I group the casts according to the movies?

Iterate over the h2 elements and, for each one, select the actor spans from the element that immediately follows it. This prints the cast grouped under each movie:

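    Elements items = doc.select("h2");
    for (Element movieElement : items) {
        // The movie title is the text of the <h2> itself
        System.out.println("*************Movie Titles:*************\n" + movieElement.text());

        // The cast spans live in the <div class=info> that directly follows the <h2>
        Element info = movieElement.nextElementSibling();
        if (info == null) continue; // no sibling, so no cast to print
        Elements castElements = info.select("span[itemprop=actors]");
        System.out.println("*************Casts:*************");
        for (Element castElement : castElements) {
            System.out.println(castElement.text());
        }
    }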
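
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same sibling-based grouping. It assumes jsoup is on the classpath. The class name `MovieCasts`, the method `groupCasts`, and the constant `SAMPLE_HTML` are made up for illustration, the sample markup is a shortened copy of the HTML in the question, and the stricter selector `h2[itemprop=name]` is my own choice to skip unrelated headings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class MovieCasts {

    // Shortened copy of the markup from the question (hypothetical constant).
    static final String SAMPLE_HTML =
        "<h2 itemprop=\"name\">World War Z</h2>"
        + "<div class=info>1hr 56min - Rated PG13<br> - Cast: "
        + "<span itemprop=\"actors\">Brad Pitt</span>, "
        + "<span itemprop=\"actors\">Mireille Enos</span>, "
        + "<span itemprop=\"actors\">James Badge Dale</span></div>"
        + "<h2 itemprop=\"name\">Monsters University</h2>"
        + "<div class=info>2hr 30min - Rated PG13<br> - Cast: "
        + "<span itemprop=\"actors\">Johnny Depp</span>, "
        + "<span itemprop=\"actors\">Watsons Junior</span></div>";

    // Maps each movie title to its cast, preserving document order.
    static Map<String, List<String>> groupCasts(String html) {
        Document doc = Jsoup.parse(html);
        Map<String, List<String>> castsByMovie = new LinkedHashMap<>();
        for (Element titleElement : doc.select("h2[itemprop=name]")) {
            List<String> cast = new ArrayList<>();
            // The info <div> is assumed to be the element right after the <h2>.
            Element info = titleElement.nextElementSibling();
            if (info != null) {
                for (Element actor : info.select("span[itemprop=actors]")) {
                    cast.add(actor.text());
                }
            }
            castsByMovie.put(titleElement.text(), cast);
        }
        return castsByMovie;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, List<String>> entry : groupCasts(SAMPLE_HTML).entrySet()) {
            System.out.println("*************Movie Titles:*************");
            System.out.println(entry.getKey());
            System.out.println("*************Casts:*************");
            for (String actor : entry.getValue()) {
                System.out.println(actor);
            }
            System.out.println();
        }
    }
}
```

Collecting into a `LinkedHashMap` keeps the movies in page order and separates parsing from printing. Note that two movies with the same title would collapse into one entry, and if the real page puts something else between the `h2` and the info `div`, the sibling lookup needs adjusting.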
