jSoup: get data from a webpage using td class selectors

I would like to get data from http://www.futbol24.com/Live/?__igp=1&LiveDate=20141104 using jSoup. I know how to use jSoup, but I am finding it difficult to pinpoint the data that I need.

I would like the Time, Home Team and Away Team from each row of the table's tbody. So the output from the first row should be:

08:30     Persipura Jayapura      Pelita Bandung Raya

I can see that the td classes of these elements are "status alt", "home" and "guest".

So far I have tried the code below, but it doesn't seem to output anything. What am I doing wrong?

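            // Needs: org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.nodes.Element,
            //        java.util.ArrayList, java.util.List
            matches = new ArrayList<Match>();

            // getHistory
            String website = "http://www.futbol24.com/Live/?__igp=1&LiveDate=20141104";
            Document doc = Jsoup.connect(website).get();

            // First <tbody> in the document; its children are the <tr> rows
            Element tblHeader = doc.select("tbody").first();
            List<Match> data = new ArrayList<>();
            for (Element element1 : tblHeader.children()) {

                Match match = new Match();
                match.setTimeOfMatch(element1.select("td.status.alt").text());
                // td.home is the home team and td.guest the away team
                match.setHomeTeam(element1.select("td.home").text());
                match.setAwayTeam(element1.select("td.guest").text());

                data.add(match);
            }
            // Print once, after all rows have been collected
            System.out.println(data);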

Does anybody know how I can use jSoup to get these elements from each row of the table?

Thanks,

Rob

The content of this site appears to be generated via AJAX. Jsoup can't handle this, since it is not a browser and does not interpret JavaScript. To solve this scraping problem you may need something like Selenium WebDriver. I gave a longer answer to a more general version of this question before, so please look here:

Jsoup get dynamically generated HTML
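
Before reaching for a browser-driving tool, it can help to confirm the diagnosis by looking at the raw HTML the server returns, since that is all Jsoup ever sees. The following is only a rough sketch using the JDK alone: the class name `StaticMarkupCheck` and the helper `hasCellClass` are hypothetical, the regex is a crude check and not an HTML parser, and both sample strings are invented for illustration rather than taken from futbol24.com.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StaticMarkupCheck {

    // Matches an opening <td ...> tag and captures the value of its class attribute.
    private static final Pattern TD_CLASS = Pattern.compile(
            "<td\\b[^>]*\\bclass\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns true if the raw HTML contains a <td> whose class list
    // includes className as a whole token (e.g. "home" or "guest").
    static boolean hasCellClass(String rawHtml, String className) {
        Matcher m = TD_CLASS.matcher(rawHtml);
        while (m.find()) {
            for (String token : m.group(1).trim().split("\\s+")) {
                if (token.equals(className)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical server-rendered page: the cells are already in the HTML.
        String serverRendered = "<table><tbody><tr>"
                + "<td class=\"status alt\">08:30</td>"
                + "<td class=\"home\">A</td>"
                + "<td class=\"guest\">B</td>"
                + "</tr></tbody></table>";

        // Hypothetical script-rendered page: only a placeholder and a script tag.
        String scriptRendered = "<div id=\"live\"></div><script src=\"live.js\"></script>";

        System.out.println(hasCellClass(serverRendered, "home")); // prints true
        System.out.println(hasCellClass(scriptRendered, "home")); // prints false
    }
}
```

If the same check is run on the string returned by `doc.html()` for the real page and it reports no `home` or `guest` cells, the rows are being added by JavaScript after the page loads, and selectors such as `td.home` will match nothing no matter how they are written.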

