I would like to parse the HTML of a web page that uses infinite scroll, such as pinterest.com, so that I can get all of its items.
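    // Needs org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.nodes.Element,
    // org.jsoup.select.Elements, java.util.ArrayList and java.util.List.
    // urlPinterest is a field defined elsewhere in the class.
    public List<String> popularTagsPinterest(String tag) throws Exception {
        List<String> results = new ArrayList<>();
        try {
            // Fetch the search page for the tag (90-second timeout).
            Document doc = Jsoup.connect(
                    urlPinterest + tag + "&eq=%23" + tag + "&etslf=6622&term_meta[]=%23" + tag + "%7Cautocomplete%7C0")
                    .timeout(90000).get();
            // Collect the image URL of every pin present in the returned HTML.
            Elements images = doc.select("a.pinImageWrapper img.pinImg");
            for (Element img : images) {
                String src = img.attr("src");
                results.add(src);
                System.out.println(src);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return results;
    }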
Find the base URL and the AJAX call that the page makes to load each additional batch of items; requesting that call directly should do it.
This page is a good example:
https://blog.scrapinghub.com/2016/06/22/scrapy-tips-from-the-pros-june-2016
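As a rough sketch of that idea in Java: the loop below keeps requesting batches until the server stops returning a cursor for the next one. The names `Page`, `fetchAll`, the cursor scheme and the `maxPages` cap are all hypothetical, made up for illustration; the real endpoint, its parameters and its response format have to be read from the browser's network tab. The fetcher here is an in-memory stand-in so the example runs without network access. With Jsoup, a real fetcher could presumably get the raw response with `Jsoup.connect(url).ignoreContentType(true).execute().body()` and parse the items and next cursor out of it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class InfiniteScrollSketch {

    /** One batch of items plus the cursor for the next batch (null when there is none). */
    static final class Page {
        final List<String> items;
        final String nextCursor;

        Page(List<String> items, String nextCursor) {
            this.items = items;
            this.nextCursor = nextCursor;
        }
    }

    /**
     * Calls fetchPage repeatedly, starting with a null cursor (meaning "first batch"),
     * until no further cursor is returned or maxPages batches have been fetched.
     */
    static List<String> fetchAll(Function<String, Page> fetchPage, int maxPages) {
        List<String> results = new ArrayList<>();
        String cursor = null;
        for (int i = 0; i < maxPages; i++) {
            Page page = fetchPage.apply(cursor);
            results.addAll(page.items);
            cursor = page.nextCursor;
            if (cursor == null) {
                break; // the server reported no more batches
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Stand-in for the real AJAX endpoint: three hard-coded batches keyed by cursor.
        Map<String, Page> fake = new HashMap<>();
        fake.put("start", new Page(List.of("img1.jpg", "img2.jpg"), "c1"));
        fake.put("c1", new Page(List.of("img3.jpg"), "c2"));
        fake.put("c2", new Page(List.of("img4.jpg"), null));

        List<String> all = fetchAll(c -> fake.get(c == null ? "start" : c), 10);
        System.out.println(all); // prints [img1.jpg, img2.jpg, img3.jpg, img4.jpg]
    }
}
```

The `maxPages` cap is there because an infinite-scroll feed may never run out; without it the loop could keep requesting batches indefinitely.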