
Get full HTML using Jsoup

I am scraping a web page with the Jsoup library by selecting elements whose class attribute contains the string "nav".

This is the code that fetches the HTML of the site:

var bodyString = Jsoup.connect(url)
                .ignoreContentType(true)
                .userAgent("Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0")
                .timeout(12000)
                .followRedirects(true)
                .execute()
                .body();

Example of the HTML selected by the Jsoup CSS selector: [image]

Yet in the browser, the HTML of the same website looks like this: [image]

As you can see, the ul element with id="varPreviewMenu" contains li elements that the HTML retrieved by Jsoup does not contain.

How can I get those elements?

Most likely, the elements you see in the browser are added to the DOM dynamically by JavaScript. Jsoup does not execute JavaScript; it only parses the HTML that the server returns, so those elements are not present in the response body it receives.
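As a rough illustration of this, here is a minimal sketch that parses a static HTML string with Jsoup. The markup in SERVER_HTML is hypothetical (it is not the real page from the question), and the class name StaticHtmlCheck and its helper methods are made up for the example. It assumes the jsoup library is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class StaticHtmlCheck {

    // Hypothetical markup standing in for the server's response: the list is
    // empty in the source and would only be filled in by the script when a
    // browser runs it.
    static final String SERVER_HTML =
            "<html><body>"
            + "<ul id=\"varPreviewMenu\" class=\"main-nav\"></ul>"
            + "<script>"
            + "var li = document.createElement('li');"
            + "li.textContent = 'Home';"
            + "document.getElementById('varPreviewMenu').appendChild(li);"
            + "</script>"
            + "</body></html>";

    // Counts the <li> children that Jsoup can see under the element with the given id.
    static int countListItems(String html, String listId) {
        Document doc = Jsoup.parse(html);
        return doc.select("#" + listId + " > li").size();
    }

    // Counts the elements whose class attribute contains the given substring.
    static int countByClassSubstring(String html, String part) {
        Document doc = Jsoup.parse(html);
        return doc.select("[class*=" + part + "]").size();
    }

    public static void main(String[] args) {
        // The ul itself is found through its class attribute...
        System.out.println(countByClassSubstring(SERVER_HTML, "nav"));
        // ...but it has no li children, because the script was never executed.
        System.out.println(countListItems(SERVER_HTML, "varPreviewMenu"));
    }
}
```

If the page in question behaves like this, the li elements can only be obtained with something that actually runs the page's JavaScript, for example a browser-automation tool such as Selenium WebDriver. Another option is to look in the browser's network tab for the request that delivers the menu data and fetch that URL directly.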

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 