Reading a specific line of text from a web page using Jsoup

Question

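I am trying to use Jsoup to get data from this web page...

I have tried many different approaches and have gotten close, but I can't figure out how to find the labels for specific stats (Attack, Strength, Defence, etc.).

For example, I want to print out

'Attack', '15', '99', '200,000,000'

How should I go about this?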

Answer 1

You can easily extract the column data using CSS selectors in Jsoup.

// retrieve page source code
Document doc = Jsoup
        .connect("http://services.runescape.com/m=hiscore_oldschool/hiscorepersonal.ws?user1=Lynx%A0Titan")
        .get();

// find all of the table rows
Elements rows = doc.select("div#contentHiscores table tr");
ListIterator<Element> itr = rows.listIterator();

// loop over each row
while (itr.hasNext()) {
    Element row = itr.next();

    // does the second col contain the word attack?
    if (row.select("td:nth-child(2) a:contains(attack)").first() != null) {

        // if so, assign each sibling col to variable
        String rank = row.select("td:nth-child(3)").text();
        String level = row.select("td:nth-child(4)").text();
        String xp = row.select("td:nth-child(5)").text();

        System.out.printf("rank=%s level=%s xp=%s%n", rank, level, xp);

        // stop looping rows, found attack
        break;
    }
}
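
The snippet above prints the values as rank=... level=... xp=..., whereas the question asks for a quoted, comma-separated line. A minimal sketch of that last formatting step follows, using only the Java standard library. The class name StatLine and its format method are hypothetical helpers introduced for illustration and are not part of Jsoup; the sample values are copied from the question rather than fetched from the page.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Jsoup): formats cell texts that have
// already been extracted from a table row.
class StatLine {

    // Wraps each value in single quotes and joins the results with ", ".
    static String format(List<String> cells) {
        return cells.stream()
                .map(cell -> "'" + cell.trim() + "'")
                .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        // Sample values taken from the question; in practice they would come
        // from the anchor text and the row.select("td:nth-child(n)").text() calls.
        List<String> cells = Arrays.asList("Attack", "15", "99", "200,000,000");
        System.out.println(format(cells));
    }
}
```

With the values from the question this prints 'Attack', '15', '99', '200,000,000'. Trimming each cell is a defensive choice in case the extracted text carries surrounding whitespace.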

Answer 2

A very rough implementation follows. I have only shown a code snippet; it still needs optimization and additional checks.

public static void main(String[] args) throws Exception {
    Document doc = Jsoup
            .connect("http://services.runescape.com/m=hiscore_oldschool/hiscorepersonal.ws?user1=Lynx%A0Titan")
            .get();
    Element contentHiscoresDiv = doc.getElementById("contentHiscores");
    Element table = contentHiscoresDiv.child(0);
    for (Element row : table.select("tr")) {
        Elements tds = row.select("td");
        for (Element column : tds) {
            if (column.children() != null && column.children().size() > 0) {
                Element anchorTag = column.getElementsByTag("a").first();
                if (anchorTag != null && anchorTag.text().contains("Attack")) {
                    System.out.println(anchorTag.text());
                    Elements attributeSiblings = column.siblingElements();
                    for (Element attributeSibling : attributeSiblings) {
                        System.out.println(attributeSibling.text());
                    }
                }
            }
        }
    }
}

Attack

15 99 200,000,000
