
Get table span class content using jsoup

I have a website that contains a table similar to this one (the real table is bigger):

</table>    
<tr>
    <td>
        <table width="100%" cellspacing="-1" cellpadding="0" border="0" dir="rtl" style="padding-top: 25px;">
            <tr>
                <td align="right" style="padding-right: 25px;">
                    <span class="artist_name_txt">
                            <a href="/namelink">name</a>
                            <p class="diccografia">subname</p>
                            </span>
                </td>
            </tr>
        </table>
    </td>
</tr>

<tr>
    <td>
        <table width="100%" border="0" cellspacing="0" cellpadding="0" dir="rtl" style="padding-right: 25px; padding-left: 25px">

                <tr>
                        <td class="songs" align="right">

                                <a href="/number1link" class="artist_player_songlist">  number1</a>

                            </td>
                    </tr>
                <tr>
                        <td class="songs" align="right">

                                <a href="/number2link" class="artist_player_songlist">number2</a>


.......
            </td>   
        </tr>
</table>

I need an idea of how to parse the website and extract this table into two arrays:

  • one with the link texts, something like names{number1, number2}
  • and a second with the link targets, something like links{number1link, number2link}

I have tried many approaches and none of them worked.

You should read the jsoup Cookbook; the selector syntax in particular is very powerful.

Here's an example:

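// Imports this snippet needs:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import java.util.ArrayList;
import java.util.List;

final String html = ...
// To fetch a live page instead of parsing a string, use Jsoup.connect(url).get()
Document doc = Jsoup.parse(html);
List<String> names = new ArrayList<>();
List<String> links = new ArrayList<>();

// Each song is an <a class="artist_player_songlist"> element
for( Element element : doc.select("a.artist_player_songlist") )
{
    names.add(element.text());        // link text, whitespace-trimmed
    links.add(element.attr("href"));  // raw href attribute value
}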

System.out.println("Names: " + names);
System.out.println("Links: " + links);

Output:

Names: [number1, number2]  
Links: [/number1link, /number2link]
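
The question asks for two arrays, while the example above collects the results in lists. If plain arrays are really needed, the lists can be copied afterwards. Here is a minimal sketch using only the standard library; the class name ListsToArrays and the helper toArray are made up for illustration, and the hard-coded values stand in for what the jsoup loop would collect:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListsToArrays {
    // Hypothetical helper: copies a list of strings into a new String[] of the same size.
    static String[] toArray(List<String> values) {
        return values.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Stand-in data; in the real program these lists are filled by the jsoup loop.
        List<String> names = new ArrayList<>(Arrays.asList("number1", "number2"));
        List<String> links = new ArrayList<>(Arrays.asList("/number1link", "/number2link"));

        String[] nameArray = toArray(names);
        String[] linkArray = toArray(links);

        System.out.println("Names: " + Arrays.toString(nameArray));
        System.out.println("Links: " + Arrays.toString(linkArray));
    }
}
```

Since both lists are filled in the same loop, nameArray[i] and linkArray[i] refer to the same link. Unless another API insists on arrays, keeping the lists is usually simpler.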
