HtmlUnit - scraping data

Question

How can I use HtmlUnit to extract a page that contains JavaScript as HTML? I found the sample code below, but it does not work.

import java.util.logging.Level;

import org.apache.commons.logging.LogFactory;

import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class Downloader {

    public static void main(String[] args) throws Exception {
        // Silence HtmlUnit and HTTP client logging.
        LogFactory.getFactory().setAttribute("org.apache.commons.logging.Log", "org.apache.commons.logging.impl.NoOpLog");
        java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(Level.OFF);
        java.util.logging.Logger.getLogger("org.apache.commons.httpclient").setLevel(Level.OFF);

        // Load the page and print its text content.
        try (final WebClient webClient = new WebClient()) {
            final HtmlPage page = webClient.getPage("https://www.oddsportal.com/matches/soccer/");
            System.out.println(page.asText());
        }
        System.out.println("END");
    }
}

With this code I end up in an infinite loop, and I don't know why. If I open the site above in the Firefox inspector, I can see the full HTML after the JavaScript has executed. How can I get the same result with HtmlUnit? Is it possible? Should I be using a different library? Any suggestions?

Answer 1

HtmlUnit tends to have a lot of problems interpreting JavaScript. If you are just looking for the game data, you might be more successful with a different approach: https://github.com/gingeleski/odds-portal-scraper

Anyway, I managed to get the code working by changing the BrowserVersion (this needs an import of com.gargoylesoftware.htmlunit.BrowserVersion):

final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_60)
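
If the page load still hangs with some browser version, one option is to run it under a hard time limit so the program fails fast instead of looping forever. The sketch below uses only the JDK. `TimeLimitedFetch` and `fetchWithTimeout` are hypothetical names, not part of HtmlUnit, and the `Callable` is a stand-in for the `webClient.getPage(...)` call from the question. Cancelling the task only interrupts its thread, and it is an assumption that the blocked code reacts to that interrupt, which is why the worker is a daemon thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeLimitedFetch {

    /**
     * Runs a blocking fetch on a worker thread and gives up after timeoutMillis.
     * Returns the fetched text, or null if the fetch timed out or was interrupted.
     */
    public static String fetchWithTimeout(Callable<String> fetch, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "fetch-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<String> result = executor.submit(fetch);
        try {
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            result.cancel(true); // interrupts the worker; the task may ignore it
            return null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt flag
            result.cancel(true);
            return null;
        } catch (ExecutionException e) {
            // The fetch itself threw; surface the original cause.
            throw new IllegalStateException("fetch failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a page load that returns promptly.
        String quick = fetchWithTimeout(() -> "<html>ok</html>", 1_000);
        System.out.println(quick);

        // Stand-in for a page load that never returns.
        String stuck = fetchWithTimeout(() -> {
            Thread.sleep(Long.MAX_VALUE);
            return "never";
        }, 200);
        System.out.println(stuck == null ? "timed out" : stuck);
    }
}
```

In the real program, the `Callable` would wrap the page load and return the page content as a string. A `null` result then means the load did not finish in time, and you can try another BrowserVersion or another approach.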

Answer 1: 0 ACCEPTED 2019-09-21 09:15:50
