How to parse a table from HTML using jsoup

Question

<td width="10"></td>
<td width="65"><img src="/images/sparks/NIFTY.png" /></td> 
<td width="65">5,390.85</td>
<td width="65">5,428.15</td>
<td width="65">5,376.15</td>
<td width="65">5,413.85</td>

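This is the HTML source from which I have to extract the values 5390.85, 5428.15, 5376.15 and 5413.85. I want to do this with jsoup, but I am relatively new to it (I started using it today). How should I go about it?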

URL url = new URL("http://www.nseindia.com/content/equities/niftysparks.htm");
Document doc = Jsoup.parse(url,3*1000);
String text = doc.body().text();

I have already fetched the page content with jsoup, but how do I extract the values I need? Thanks in advance.

Answer 1

Try something like this:

URL url = new URL("http://www.nseindia.com/content/equities/niftysparks.htm");
Document doc = Jsoup.parse(url, 3000);

Element table = doc.select("table[class=niftyd]").first();

Iterator<Element> ite = table.select("td[width=65]").iterator();

ite.next(); // first one is image, skip it

System.out.println("Value 1: " + ite.next().text());
System.out.println("Value 2: " + ite.next().text());
System.out.println("Value 3: " + ite.next().text());
System.out.println("Value 4: " + ite.next().text());

This prints:

Value 1: 5,390.85
Value 2: 5,428.15
Value 3: 5,376.15
Value 4: 5,413.85
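
The question asks for the values without the thousands separator (5390.85 rather than 5,390.85), while `text()` returns the cell text as it appears on the page. A minimal sketch of converting such strings to numbers with the standard library is shown here; the class name `CellValues` and the helper `parseCell` are hypothetical, and it assumes the page always uses US-style grouping (comma for thousands, dot for decimals).

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParseException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class CellValues {

    // Hypothetical helper: turns cell text such as "5,390.85" into an exact decimal.
    // Assumes US-style formatting (',' groups thousands, '.' marks decimals).
    static BigDecimal parseCell(String cellText) throws ParseException {
        DecimalFormat format =
                new DecimalFormat("#,##0.###", DecimalFormatSymbols.getInstance(Locale.US));
        format.setParseBigDecimal(true); // make parse() return BigDecimal instead of Double/Long
        return (BigDecimal) format.parse(cellText.trim());
    }

    public static void main(String[] args) throws ParseException {
        // The strings printed by the answer above; in real code these would come from ite.next().text().
        String[] cells = {"5,390.85", "5,428.15", "5,376.15", "5,413.85"};

        List<BigDecimal> values = new ArrayList<>();
        for (String cell : cells) {
            values.add(parseCell(cell));
        }
        System.out.println(values);
    }
}
```

`BigDecimal` is used here so the prices keep their exact two decimal places; if approximate values are acceptable, `format.parse(...).doubleValue()` would work without the `setParseBigDecimal` call.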

Answer 2

Here is an example in Groovy:

def url = "http://www.espn.co.uk/scrum/rugby/match/scores/recent.html"
def doc = Jsoup.connect(url).get()

//Strip the table from the page
def table = doc.select("table").first()
// Strip the rows from the table
def tbRows = table.select("tr")

// For each column in a row, print its contents if not empty
tbRows.each { row ->
    def tbCol = row.select("td")
    tbCol.each { column ->
        if(!column.text().empty) {
            println column.text()
        }
    }
}

You can save the values to an array for further processing. Just another point of view.


Answer 1 (33, accepted, 2011-03-22 19:40:41)
Answer 2 (5, 2015-01-14 12:12:00)
