Web scraping using the Jsoup library to fetch data from a given table

I'm trying to scrape some data from a web page, but I haven't been able to. I tried doing it with substring(), but that's very inefficient. Here's part of the code I've written:

    String url = "https://www.premierleague.com/tables";
    Document document = Jsoup.connect(url).get();

    // Take the first <table> element on the page.
    Element table = document.select("table").get(0);

    // All rows of that table; index 1 is the second <tr>,
    // which skips what is presumably the header row.
    Elements rows = table.select("tr");
    Element row = rows.get(1);

    // The cells of that row.
    Elements cols = row.select("td");

Can anyone help me by giving a few examples using the same link?

    String url = "https://www.premierleague.com/tables";
    Document doc = Jsoup.connect(url).get();
    Element table = doc.select("table").first();

    // One iterator per column, matched on the exact attribute value.
    Iterator<Element> team = table.select("td[class=team]").iterator();
    Iterator<Element> rank = table.select("td[id=tooltip]").iterator();
    Iterator<Element> points = table.select("td[class=points]").iterator();

    // Each next() call returns the matching cell of the next row.
    System.out.println(team.next().text());
    System.out.println(rank.next().text());
    System.out.println(points.next().text());

Output:

ChelseaCHE
1 Previous Position 1
46
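
The printed text runs the club name and its abbreviation together ("ChelseaCHE"), and the rank cell carries extra tooltip text ("1 Previous Position 1"). Below is a minimal sketch of one way to tidy these strings with regular expressions instead of substring(). The class `CellText` and its methods are hypothetical helpers, not part of Jsoup, and they assume the abbreviation is always three upper-case letters at the end of the cell and that the rank cell starts with the current position.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for tidying the cell text printed above. */
class CellText {
    // Assumption: club name followed directly by a three-letter
    // upper-case abbreviation, as in "ChelseaCHE".
    private static final Pattern TEAM = Pattern.compile("^(.*?)([A-Z]{3})$");

    // Assumption: the rank cell starts with the current position,
    // as in "1 Previous Position 1".
    private static final Pattern LEADING_INT = Pattern.compile("^\\s*(\\d+)");

    /** Returns the club name without the trailing abbreviation. */
    static String teamName(String cellText) {
        Matcher m = TEAM.matcher(cellText.trim());
        return m.matches() ? m.group(1).trim() : cellText.trim();
    }

    /** Returns the trailing abbreviation, or "" if there is none. */
    static String teamCode(String cellText) {
        Matcher m = TEAM.matcher(cellText.trim());
        return m.matches() ? m.group(2) : "";
    }

    /** Returns the leading integer of the rank cell. */
    static int position(String cellText) {
        Matcher m = LEADING_INT.matcher(cellText);
        if (!m.find()) {
            throw new IllegalArgumentException("No leading number in: " + cellText);
        }
        return Integer.parseInt(m.group(1));
    }

    public static void main(String[] args) {
        System.out.println(teamName("ChelseaCHE"));            // Chelsea
        System.out.println(teamCode("Tottenham HotspurTOT"));  // TOT
        System.out.println(position("5 Previous Position 5")); // 5
    }
}
```

With the iterators above, this could be applied as `CellText.teamName(team.next().text())`. If the page markup changes, the patterns will need adjusting.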

Edit: to respond to your question:

    // First row.
    System.out.println(team.next().text());
    System.out.println(rank.next().text());
    System.out.println(points.next().text());

    // Skip rows 2 to 4 by advancing each iterator three times.
    team.next();
    team.next();
    team.next();

    rank.next();
    rank.next();
    rank.next();

    points.next();
    points.next();
    points.next();

    // Fifth row.
    System.out.println(team.next().text());
    System.out.println(rank.next().text());
    System.out.println(points.next().text());

Output:

ChelseaCHE
1 Previous Position 1
46
Tottenham HotspurTOT
5 Previous Position 5
33
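
The repeated next() calls used to reach the fifth row can be folded into a small helper. This is only a sketch: `IteratorSkip.skip` is a hypothetical utility, and the list of strings in `main` is placeholder data standing in for the Jsoup iterators so the example runs without a network connection.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical utility for advancing an iterator past unwanted rows. */
class IteratorSkip {
    /**
     * Advances the iterator by up to n elements.
     * Returns how many elements were actually skipped, which is less
     * than n if the iterator ran out first.
     */
    static <T> int skip(Iterator<T> it, int n) {
        int skipped = 0;
        while (skipped < n && it.hasNext()) {
            it.next();
            skipped++;
        }
        return skipped;
    }

    public static void main(String[] args) {
        // Placeholder data; with Jsoup this would be something like
        // table.select("td[class=team]").iterator().
        List<String> teams = Arrays.asList(
                "Team 1", "Team 2", "Team 3", "Team 4", "Team 5", "Team 6");
        Iterator<String> team = teams.iterator();

        System.out.println(team.next()); // Team 1
        skip(team, 3);                   // rows 2 to 4
        System.out.println(team.next()); // Team 5
    }
}
```

Since Jsoup's Elements class extends ArrayList<Element>, another option is to skip the iterators and index directly, for example `table.select("td[class=team]").get(4)` for the fifth match.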
