How to select a table from Wikipedia in jsoup?

I am trying to write an application in Java that uses information from https://en.wikipedia.org/wiki/List_of_cities_in_Switzerland . Specifically, I need the list of cities in Switzerland, which I have to extract from the table on that page. I need to use Jsoup for this, but my program cannot "see" or select that particular table. I have tried several approaches and spent hours on it, to no avail. I have managed to select the tables at the bottom of the page, "Switzerland articles" and "List of cities in Europe", with

Document doc = Jsoup.connect("https://en.wikipedia.org/wiki/List_of_cities_in_Switzerland").get();
Elements table = doc.select("table");

but, for some reason, it seems to "skip" the table I'm looking for. (The result has seemingly empty tables at table.get(0) through table.get(2), and table.get(3) is the "Switzerland articles" one.) The selector from Chrome's "Copy selector" option did not work either: it returned an empty result (size 0), and I got a NullPointerException when trying to parse it. I am very new to HTML and Jsoup and cannot work out where my problem is.

Use this selector:

doc.select(".wikitable");

Also set a User-Agent string that matches your browser, so that your application receives the same page your browser does, like this:

Document doc = Jsoup.connect("https://en.wikipedia.org/wiki/List_of_cities_in_Switzerland")
            .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0")
            .get();
Elements table = doc.select(".wikitable");
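Once the table is selected, the city names still have to be read out of its rows. Below is a minimal sketch of one way to do that. It parses a small inline HTML string (SAMPLE_HTML, a hypothetical stand-in for the Wikipedia page) rather than fetching the live page, and it assumes the city name is in the first td cell of each data row, which you should check against the real markup. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class CityTableSketch {
    // Hypothetical sample markup: one "wikitable" plus one unrelated navigation table.
    static final String SAMPLE_HTML =
        "<table class=\"wikitable sortable\">"
        + "<tr><th>Name</th><th>Canton</th></tr>"
        + "<tr><td><a href=\"/wiki/Aarau\">Aarau</a></td><td>AG</td></tr>"
        + "<tr><td><a href=\"/wiki/Zug\">Zug</a></td><td>ZG</td></tr>"
        + "</table>"
        + "<table class=\"navbox\"><tr><td>Switzerland articles</td></tr></table>";

    // Returns the text of the first <td> in every row of the first table.wikitable.
    static List<String> firstColumn(Document doc) {
        List<String> names = new ArrayList<>();
        Element table = doc.selectFirst("table.wikitable");
        if (table == null) {
            return names; // no matching table: return an empty list instead of failing
        }
        for (Element row : table.select("tr")) {
            Element cell = row.selectFirst("td");
            if (cell != null) { // header rows contain only <th>, so they are skipped
                names.add(cell.text());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(firstColumn(Jsoup.parse(SAMPLE_HTML)));
    }
}
```

To run it against the live page, pass the Document returned by Jsoup.connect(...).get() to firstColumn instead of the parsed sample. The null checks are there because selectFirst returns null when nothing matches, which is a likely source of the NullPointerException mentioned in the question.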

Have you tried:

doc.select("div#mw-content-text table.wikitable");

It should work according to the jsoup selector documentation: https://jsoup.org/apidocs/org/jsoup/select/Selector.html

You can read further there :)
