
How to parse a table from HTML using jsoup

<td width="10"></td>
<td width="65"><img src="/images/sparks/NIFTY.png" /></td> 
<td width="65">5,390.85</td>
<td width="65">5,428.15</td>
<td width="65">5,376.15</td>
<td width="65">5,413.85</td>

This is the HTML source from which I need to extract the values 5390.85, 5428.15, 5376.15 and 5413.85. I want to do this using jsoup, but I am relatively new to it (I started using it today). How should I do this?

URL url = new URL("http://www.nseindia.com/content/equities/niftysparks.htm");
Document doc = Jsoup.parse(url,3*1000);
String text = doc.body().text();

I have already fetched the content of the website using jsoup, but how do I extract the values I need? Thanks in advance.

Try something like this:

URL url = new URL("http://www.nseindia.com/content/equities/niftysparks.htm");
Document doc = Jsoup.parse(url, 3000);

Element table = doc.select("table[class=niftyd]").first();

Iterator<Element> ite = table.select("td[width=65]").iterator();

ite.next(); // first one is image, skip it

System.out.println("Value 1: " + ite.next().text());
System.out.println("Value 2: " + ite.next().text());
System.out.println("Value 3: " + ite.next().text());
System.out.println("Value 4: " + ite.next().text());

Here's the printout:

Value 1: 5,390.85
Value 2: 5,428.15
Value 3: 5,376.15
Value 4: 5,413.85
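
The printed values still contain the thousands separator, while the question asks for plain numbers such as 5390.85. Below is a minimal sketch of one way to convert the cell text, using only the JDK. The class name QuoteParser and the method parseQuote are made up for illustration, and the sketch assumes the comma is always a grouping separator and the dot the decimal point:

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical helper, not part of jsoup: turns cell text into a number.
final class QuoteParser {

    // Parses text such as "5,390.85" (comma = grouping, dot = decimal point).
    static BigDecimal parseQuote(String cellText) {
        DecimalFormat format =
                new DecimalFormat("#,##0.###", DecimalFormatSymbols.getInstance(Locale.US));
        format.setParseBigDecimal(true); // return BigDecimal instead of Double/Long

        String trimmed = cellText.trim();
        ParsePosition pos = new ParsePosition(0);
        Number parsed = format.parse(trimmed, pos);

        // Reject empty input and trailing garbage instead of parsing a prefix.
        if (parsed == null || pos.getIndex() != trimmed.length()) {
            throw new IllegalArgumentException("Not a number: " + cellText);
        }
        return (BigDecimal) parsed;
    }

    public static void main(String[] args) {
        // In the answer above these strings would come from ite.next().text().
        String[] cells = {"5,390.85", "5,428.15", "5,376.15", "5,413.85"};
        for (String cell : cells) {
            System.out.println(parseQuote(cell));
        }
    }
}
```

BigDecimal is used here rather than double so that the quoted prices keep their exact decimal digits; if approximate arithmetic is acceptable, calling doubleValue() on the result would also work.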

Here's an example using Groovy:

def url = "http://www.espn.co.uk/scrum/rugby/match/scores/recent.html"
def doc = Jsoup.connect(url).get()

// Select the first table on the page
def table = doc.select("table").first()
// Select the rows of that table
def tbRows = table.select("tr")

// For each column in a row, print its contents if not empty
tbRows.each { row ->
    def tbCol = row.select("td")
    tbCol.each { column ->
        if(!column.text().empty) {
            println column.text()
        }
    }
}

You could save this to an array for further processing. Just another perspective.


 