
Read a specified line of text from a webpage with Jsoup

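So I'm trying to get data from this webpage using Jsoup...

I've looked up many different ways to do it and I've gotten close, but I don't know how to find the tags for specific stats (Attack, Strength, Defence, etc.).

For example, I want to print out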

'Attack', '15', '99', '200,000,000' 

How should I go about doing this?

You can use CSS selectors in Jsoup to easily extract the column data.

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// retrieve page source code (get() throws IOException)
Document doc = Jsoup
        .connect("http://services.runescape.com/m=hiscore_oldschool/hiscorepersonal.ws?user1=Lynx%A0Titan")
        .get();

// find all of the table rows
Elements rows = doc.select("div#contentHiscores table tr");

// loop over each row
for (Element row : rows) {

    // does the second col contain the word attack? (:contains is case-insensitive)
    if (row.select("td:nth-child(2) a:contains(attack)").first() != null) {

        // if so, assign each sibling col to a variable
        String rank = row.select("td:nth-child(3)").text();
        String level = row.select("td:nth-child(4)").text();
        String xp = row.select("td:nth-child(5)").text();

        System.out.printf("rank=%s level=%s xp=%s%n", rank, level, xp);

        // stop looping rows, found attack
        break;
    }
}
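
To print the values in the exact form the question asks for, the strings could be passed through a small formatting helper instead of the printf call. This is only a sketch: `StatLine` and `format` are hypothetical names, not part of Jsoup, and the sample values are copied from the question rather than fetched from the page.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class StatLine {
    // Hypothetical helper: wraps each value in single quotes and joins them with ", ".
    static String format(String... values) {
        return Arrays.stream(values)
                .map(v -> "'" + v + "'")
                .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        // Sample values from the question; in the loop above these would be
        // the skill name followed by the rank, level and xp strings.
        System.out.println(format("Attack", "15", "99", "200,000,000"));
    }
}
```

In the loop above, the call would look like `StatLine.format("Attack", rank, level, xp)`.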

A very rough implementation is below. I've only shown a snippet; optimizations and other conditions still need to be added.

public static void main(String[] args) throws Exception {
    Document doc = Jsoup
            .connect("http://services.runescape.com/m=hiscore_oldschool/hiscorepersonal.ws?user1=Lynx%A0Titan")
            .get();
    Element contentHiscoresDiv = doc.getElementById("contentHiscores");
    // assumes the hiscores table is the first child of the div
    Element table = contentHiscoresDiv.child(0);
    for (Element row : table.select("tr")) {
        for (Element column : row.select("td")) {
            // cells without a link yield null here and are skipped
            Element anchorTag = column.getElementsByTag("a").first();
            if (anchorTag != null && anchorTag.text().contains("Attack")) {
                System.out.println(anchorTag.text());
                // print the other cells of the same row
                for (Element attributeSibling : column.siblingElements()) {
                    System.out.println(attributeSibling.text());
                }
            }
        }
    }
}

Attack

15 99 200,000,000
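
The values come back as text, so comparing or summing them needs a conversion first. A minimal sketch, assuming the page always uses commas as thousands separators as in the output above; `StatParser` and `parseStat` are hypothetical names introduced for illustration.

```java
public class StatParser {
    // Hypothetical helper: strips thousands separators and parses the rest as a long.
    // Throws NumberFormatException if the cell holds anything other than digits and commas.
    static long parseStat(String text) {
        return Long.parseLong(text.trim().replace(",", ""));
    }

    public static void main(String[] args) {
        long xp = parseStat("200,000,000");
        long level = parseStat("99");
        System.out.println(level + " " + xp); // prints "99 200000000"
    }
}
```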

