How do I get the content of a specific HTML element in Jsoup?

Question

I'm currently trying to use jsoup to get a table and its content/formatting from Wikipedia. However, when I run this code, I get an error at line 29:

Exception in thread "main" java.lang.NullPointerException at project.wikiclass.main(wikiclass.java:29)

I don't know how to get the data. The name I'm currently using seems to be wrong. The table is at:

https://zh.wikipedia.org/wiki/利物浦足球俱樂部＃First-team_squad

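In Inspect Element, the outermost element I need is shown as <table border="0">.

But I can't get the element by id using the name border. If anyone could tell me how to get this element, or what its real name is, that would be very helpful. The element can be found by going to the linked page, highlighting the list of names, and using Inspect Element.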

import java.io.IOException;    
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class wikiclass {

  public static void main(String[] args) {

    Document doc;
    try {

        // need http protocol
        doc = Jsoup.connect("https://en.wikipedia.org/wiki/Liverpool_F.C.").get();

        // get page title
        String title = doc.title();
        System.out.println("title : " + title);

        //make html file
        StringBuffer html = new StringBuffer();

        // get the table by id (this is the call that fails with the NullPointerException)
        String table = doc.getElementById("border").outerHtml();
        System.out.println(table);
        /*for (Element link : links) {

            // get the value from href attribute
            System.out.println("\nlink : " + link.attr("href"));
            System.out.println("text : " + link.text());

        }*/

    } catch (IOException e) {
        e.printStackTrace();
    }

  }

}

Answer 1

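I think you're getting the NPE because Jsoup can't find this element. getElementById looks for an element whose id attribute equals "border"; here border is an attribute name, not an id, so the call returns null and .outerHtml() then throws.

You can try this: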

 Elements table = doc.select("div#bodyContent table.infobox");

Then iterate over each element and get the information.
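As a rough sketch of that select-then-iterate idea, the snippet below parses a small inline HTML string instead of fetching the live page. The class name TableSelectSketch, the helper rowTexts, and the sample markup are all made up for illustration, and the real Wikipedia markup may well differ, so the selector will probably need adjusting against the actual page. It only assumes the table really does carry border="0", as described in the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

public class TableSelectSketch {

    // Hypothetical stand-in for the downloaded page (assumption: not the real Wikipedia markup).
    static final String SAMPLE_HTML =
          "<div id=\"bodyContent\">"
        + "<table border=\"0\">"
        + "<tr><td>1</td><td>Goalkeeper</td></tr>"
        + "<tr><td>2</td><td>Defender</td></tr>"
        + "</table>"
        + "</div>";

    // Collects the text of every <tr> inside the tables matched by the CSS selector.
    static List<String> rowTexts(Document doc, String tableSelector) {
        List<String> rows = new ArrayList<>();
        // select() returns an empty Elements collection when nothing matches, not null
        Elements tables = doc.select(tableSelector);
        for (Element table : tables) {
            for (Element row : table.select("tr")) {
                rows.add(row.text());
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(SAMPLE_HTML);

        // "border" is an attribute, not an id, so this lookup finds nothing
        System.out.println("found by id: " + (doc.getElementById("border") != null));

        // attribute selector: any <table> whose border attribute equals 0
        for (String row : rowTexts(doc, "table[border=0]")) {
            System.out.println(row);
        }
    }
}
```

Because select() gives back an empty collection rather than null when nothing matches, looping over the result avoids the NullPointerException from the question. With a real page you would swap Jsoup.parse(SAMPLE_HTML) for the Jsoup.connect(...).get() call already used above.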
