
Java jsoup selecting links

I am trying to develop a web scraper. I can extract all the links from a page, but I only want some specific ones. I looked into it but could not manage it, as I don't have a good knowledge of HTML.

[Image: enter image description here]

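 // Narrow down step by step: the content div, then the list, then its links.
 Element divContent = doc.select("div.content").first();
 Element ul = divContent.select("ul.indepth-list").first();
 Elements links = ul.select("a[href]");

Written without an editor, so I can't be sure the syntax is correct. Note that first() returns null when nothing matches, so check for null before chaining.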

You can use the CSS selector presented in the snippet below:

doc.select("div.indepth-content > div.content > ul.indepth-list a")
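As a rough, self-contained sketch of how that selector could be used: the HTML below is made up to mirror the class names in the selector (the real page's markup isn't shown here), and the class name `IndepthLinks`, the method `extractLinks`, and the base URL `https://example.com/` are hypothetical. It assumes jsoup is on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class IndepthLinks {

    // Returns the absolute URLs of the links inside ul.indepth-list.
    static List<String> extractLinks(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> urls = new ArrayList<>();
        for (Element a : doc.select("div.indepth-content > div.content > ul.indepth-list a[href]")) {
            urls.add(a.absUrl("href")); // resolves relative hrefs against baseUri
        }
        return urls;
    }

    public static void main(String[] args) {
        // Hypothetical markup, assumed to resemble the page in the question.
        String html = "<div class=\"indepth-content\"><div class=\"content\">"
                + "<ul class=\"indepth-list\">"
                + "<li><a href=\"/story/1\">First</a></li>"
                + "<li><a href=\"/story/2\">Second</a></li>"
                + "</ul></div></div>"
                + "<a href=\"/other\">Not wanted</a>";
        extractLinks(html, "https://example.com/").forEach(System.out::println);
    }
}
```

Adding `[href]` to the final `a` skips anchors without a link target, and `absUrl` is used so relative links come back as full URLs. For a live page you would fetch the document with `Jsoup.connect(url).get()` instead of parsing a string.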

From the screenshot, it seems you're using the Chrome browser. If so, next time you can have it generate the CSS selector for you:

  1. Right-click the element you want to target
  2. Click "Inspect" (a node should appear selected)
  3. Right-click this node, then choose "Copy" and then "Copy selector"

=> The CSS selector is copied in the clipboard

Please note that Chrome tends to generate (very) long CSS queries. Also, it can't generate CSS selectors for matching multiple elements.
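To illustrate the single-element limitation: a selector copied from DevTools typically pins down one node with a position filter such as `:nth-child(...)`, and removing that filter generalises it to the sibling elements. The sketch below uses made-up HTML and selector strings (they are not taken from the page in the question), and `SelectorScope` and `countMatches` are hypothetical names.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class SelectorScope {

    // Counts how many elements in the given HTML match a CSS selector.
    static int countMatches(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        return doc.select(cssQuery).size();
    }

    public static void main(String[] args) {
        // Hypothetical list markup used only for this demonstration.
        String html = "<ul class=\"indepth-list\">"
                + "<li><a href=\"/1\">One</a></li>"
                + "<li><a href=\"/2\">Two</a></li>"
                + "<li><a href=\"/3\">Three</a></li>"
                + "</ul>";
        // Shaped like a copied selector: targets only the second item.
        System.out.println(countMatches(html, "ul.indepth-list > li:nth-child(2) > a"));
        // Position filter removed: matches every link in the list.
        System.out.println(countMatches(html, "ul.indepth-list > li > a"));
    }
}
```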

However, if you press CTRL + F while the DevTools pane is open and the Elements tab is selected, you can type a CSS selector and browse through the matched elements.


For more details, you can have a look at the following resources:
