Jsoup: arranging table data scraped from a website

Question

I want to get a table from https://ms.wikipedia.org/wiki/Malaysia. Here is the table I want from the website.

But the result is not what I want.

I have two questions:

First, how can I arrange the output in rows and columns, like the table in my picture? Below is the source code I use to get the data.

String URL = "https://ms.wikipedia.org/wiki/Malaysia";
Document doc = Jsoup.connect(URL).get();
Elements trs = doc.select("#mw-content-text > div > table:nth-child(148)");
String currentRow = null;
for (Element tr : trs){
    Elements tdDay = tr.select("tr:has(th)");
        currentRow = tdDay.text();
        System.out.print(currentRow);
}

Second, is the following selector from my source code the best way to scrape one particular element out of a page such as https://ms.wikipedia.org/wiki/Malaysia?

Elements trs = doc.select("#mw-content-text > div > table:nth-child(148)");

I ask because the page has 3 tables with the class wikitable (<table class="wikitable">). So how can I select only one particular table?

Answer 1

The page you linked contains several wikitable tables. Look at the selector for the data inside the table you want: its cells are <th> and <td> elements, so you can select them row by row.

// nth-child is 1-based, so start at 1 and include row x
for (int i = 1; i <= x; i++) {
    Elements ths = doc.select("#mw-content-text > div > table:nth-child(148) > tbody > tr:nth-child(" + i + ") > th");
    Elements tds = doc.select("#mw-content-text > div > table:nth-child(148) > tbody > tr:nth-child(" + i + ") > td");
    System.out.println(ths.text() + "\t" + tds.text());
}

Try this, where x in the for loop is the number of rows in the table, so that the data of every row gets scraped.
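As a rough sketch of the same idea without a hard-coded position such as nth-child(148), one could select all tables with the class wikitable, pick one by its index, and walk its rows once. The class name WikitableRows, the helper names nthWikitable and readRows, and the inline HTML are made up for illustration; on the real page the index of the wanted table would have to be checked by hand.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class WikitableRows {
    // Returns the wikitable at the given zero-based position in document order,
    // or null if there is no table at that position.
    static Element nthWikitable(Document doc, int index) {
        Elements tables = doc.select("table.wikitable");
        return (index >= 0 && index < tables.size()) ? tables.get(index) : null;
    }

    // Walks the table once and returns the text of each row's cells (th or td).
    // Rows of nested tables, if any, would be included as well.
    static List<List<String>> readRows(Element table) {
        List<List<String>> rows = new ArrayList<>();
        for (Element tr : table.select("tr")) {
            List<String> cells = new ArrayList<>();
            for (Element cell : tr.children()) {
                cells.add(cell.text());
            }
            rows.add(cells);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Inline HTML standing in for the real page (an assumption for this sketch).
        String html = "<table class=\"wikitable\"><tr><th>A</th><td>1</td></tr></table>"
                + "<p>some text</p>"
                + "<table class=\"wikitable\">"
                + "<tr><th>K1</th><td>V1</td></tr>"
                + "<tr><th>K2</th><td>V2</td></tr>"
                + "</table>";
        Document doc = Jsoup.parse(html);
        Element table = nthWikitable(doc, 1); // second wikitable on the page
        for (List<String> row : readRows(table)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

Selecting by class and index should survive small changes to the surrounding page better than a long child-position path, but it still breaks if tables are added or reordered.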

Answer 2

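import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class WikiTable {
    public static void main(String[] args) throws IOException {
        String URL = "https://ms.wikipedia.org/wiki/Malaysia";
        Document doc = Jsoup.connect(URL).get();
        // Select the element that directly follows the header containing "Trivia"
        // and has the value "wikitable" for its class attribute
        Element table = doc.select("h2:contains(Trivia)+[class=\"wikitable\"]").first();
        if (table == null) {
            return; // the page layout changed and the selector no longer matches
        }
        // Then select each row of the table
        Elements trs = table.select("tr");
        // For each row, get the first and second child, i.e. columns one and two
        for (Element tr : trs) {
            if (tr.children().size() < 2) continue; // skip rows with a single cell
            Element th = tr.child(0);
            Element td = tr.child(1);
            // Left-align both columns in 40-character-wide fields
            System.out.printf("%-40s %-40s%n", th.text(), td.text());
        }
    }
}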
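The fixed width of 40 in the printf call lines the columns up only while every cell is shorter than 40 characters. As a possible variation, the sketch below first measures the longest cell in each column and then pads to that width. It uses plain Java and no Jsoup; the class name ColumnFormatter, the method format, and the sample rows are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ColumnFormatter {
    // Pads every cell to the width of the longest cell in its column
    // and joins the cells of a row with " | ".
    static List<String> format(List<String[]> rows) {
        int cols = 0;
        for (String[] row : rows) {
            cols = Math.max(cols, row.length);
        }
        int[] widths = new int[cols];
        Arrays.fill(widths, 1); // a width of 0 is not a valid format width
        for (String[] row : rows) {
            for (int c = 0; c < row.length; c++) {
                widths[c] = Math.max(widths[c], row[c].length());
            }
        }
        List<String> lines = new ArrayList<>();
        for (String[] row : rows) {
            StringBuilder sb = new StringBuilder();
            for (int c = 0; c < row.length; c++) {
                if (c > 0) {
                    sb.append(" | ");
                }
                // "%-Ns" left-aligns the text in a field of N characters
                sb.append(String.format("%-" + widths[c] + "s", row[c]));
            }
            lines.add(sb.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample rows standing in for the th/td text scraped from the table.
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"Country", "Malaysia"});
        rows.add(new String[] {"Code", "MY"});
        for (String line : format(rows)) {
            System.out.println(line);
        }
    }
}
```

The text of each th and td could be collected into such rows inside the loop over trs and printed once the whole table has been read.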
