Jsoup - extracting data from an <a> tag, inside a <td> tag
I want to extract data from a website using Jsoup. The data are in a table.
HTML code:
<table><tr><td><a href="......">Pop.Density</a></td>
<td>123</td></tr></table>
I want to print:
zip code...(taken from a text file): 123
I get the following exception:
Exception in thread "main" java.lang.NullPointerException
Any help would be appreciated. Thank you!
This is my code:
String s = br.readLine();
String str="http://www.bestplaces.net/people/zip-code/illinois/"+s;
org.jsoup.Connection conn = Jsoup.connect(str);
conn.timeout(1800000);
Document doc = conn.get();
for (Element table : doc.select("table"))
{
    for (Element row : table.select("tr"))
    {
        Elements tds = row.select("td");
        if (tds.size() > 1)
        {
            Element link = tds.get(0).select("a").first();
            String linkText = link.text();
            if (link.text().contains("Pop.Density"))
                System.out.println(s+","+tds.get(1).text());
        }
    }
}
UPDATE: If I modify the last if():
if (tds.get(0).select("a").text().contains("Pop.Density"))
I do not get any exceptions, but no output either.
Assuming the shared HTML is not the real page being used, I think it is throwing the exception when the first TD doesn't have an <a> tag. I think you need to update
if (tds.size() > 1)
to
if (tds.size() > 1 && tds.get(0).select("a") != null
&& tds.get(0).select("a").first() != null)
If this is not the case, sharing the line number where the NullPointerException originates would help in finding the solution.
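For illustration, here is a minimal sketch of the null guard applied to the whole loop. It parses an inline HTML string instead of fetching the real page, so the markup, the class name PopDensityExample, the helper findValue, and the zip code 60601 are all assumptions made up for this example and are not taken from the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class PopDensityExample {

    // Hypothetical helper: returns the text of the second cell in the first row
    // whose first cell holds a link containing the given label, or null if no
    // such row exists.
    static String findValue(String html, String label) {
        Document doc = Jsoup.parse(html);
        for (Element row : doc.select("table tr")) {
            Elements tds = row.select("td");
            if (tds.size() < 2) {
                continue; // header rows or single-cell rows
            }
            // select() never returns null, but first() returns null
            // when the cell contains no <a> element.
            Element link = tds.get(0).select("a").first();
            if (link != null && link.text().contains(label)) {
                return tds.get(1).text();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Sample markup (an assumption): the first row has no link, which is
        // the situation that would trigger the NullPointerException.
        String html = "<table>"
                + "<tr><td>No link here</td><td>n/a</td></tr>"
                + "<tr><td><a href=\"#\">Pop.Density</a></td><td>123</td></tr>"
                + "</table>";
        String zip = "60601"; // placeholder zip code, not from the question
        System.out.println(zip + "," + findValue(html, "Pop.Density"));
    }
}
```

This also fits the behaviour described in the update: calling text() on the empty result of select("a") returns an empty string rather than throwing, so the exception goes away. If nothing is printed after that change, the label text in the real page probably differs from "Pop.Density", and printing tds.get(0).text() for each row is one way to check.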