jSoup get data using td-class tags from webpage

I would like to get data from http://www.futbol24.com/Live/?__igp=1&LiveDate=20141104 using jSoup. I know how to use jSoup - but I am finding it difficult to pinpoint the data that I need.

I would like the Time, Home Team and Away Team from each row of the tbody table. So the output from the first row should be:

08:30     Persipura Jayapura      Pelita Bandung Raya

I can see the td class of each of these elements as "status alt", "home" and "guest".

Currently I have tried the below, but it doesn't seem to output anything... what am I doing wrong?

        matches = new ArrayList<Match>();

        // getHistory
        String website = "http://www.futbol24.com/Live/?__igp=1&LiveDate=20141104";
        Document doc = Jsoup.connect(website).get();

        // Take the first <tbody> and treat each child element as one match row
        Element tblHeader = doc.select("tbody").first();
        List<Match> data = new ArrayList<>();
        for (Element element1 : tblHeader.children()) {

            Match match = new Match();
            match.setTimeOfMatch(element1.select("td.status.alt").text());
            match.setHomeTeam(element1.select("td.home").text());
            match.setAwayTeam(element1.select("td.guest").text());

            data.add(match);
            System.out.println(data.toString());
        }

Does anybody know how I can use jSoup to get these elements from each row of the table?

Thanks,

Rob

It seems the content of this site is loaded via AJAX. Jsoup can't handle this, because it is not a browser and does not execute JavaScript; it only sees the HTML that the server returns initially. To scrape this page you may need something like Selenium WebDriver. I gave a longer answer to a more general question about this before, so please look here:

Jsoup get dynamically generated HTML
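
To illustrate the point, here is a minimal sketch that runs the selectors from the question against two inline HTML strings instead of the live site. The sample markup is an assumption built from the class names mentioned in the question ("status alt", "home", "guest"), not a copy of the real page, and MatchRowParser and parseRows are hypothetical names. It suggests that the selectors themselves are fine once the rows are present in the HTML, and that an empty tbody (roughly what Jsoup would see if the rows are filled in later by JavaScript) simply yields no results.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class MatchRowParser {

    /** Returns "time | home | guest" for every table row that contains all three cells. */
    static List<String> parseRows(String html) {
        Document doc = Jsoup.parse(html);
        List<String> rows = new ArrayList<>();
        for (Element row : doc.select("tbody tr")) {
            Element time = row.selectFirst("td.status.alt");
            Element home = row.selectFirst("td.home");
            Element guest = row.selectFirst("td.guest");
            if (time == null || home == null || guest == null) {
                continue; // skip header or separator rows that lack these cells
            }
            rows.add(time.text() + " | " + home.text() + " | " + guest.text());
        }
        return rows;
    }

    public static void main(String[] args) {
        // Assumed shape of a fully rendered row (illustrative markup only).
        String rendered =
                "<table><tbody><tr>"
                + "<td class=\"status alt\">08:30</td>"
                + "<td class=\"home\">Persipura Jayapura</td>"
                + "<td class=\"guest\">Pelita Bandung Raya</td>"
                + "</tr></tbody></table>";

        // Assumed shape of the table before any script has filled it in.
        String unrendered = "<table><tbody></tbody></table>";

        System.out.println(parseRows(rendered));   // one entry
        System.out.println(parseRows(unrendered)); // empty list
    }
}
```

If a real browser is driven with Selenium, the rendered markup can be obtained from WebDriver's getPageSource() and passed to a method like parseRows, so the Jsoup selectors can stay as they are.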
