Get data from table using jSoup
I am looking to get data from the table on http://www.sportinglife.com/greyhounds/abc-guide using jSoup. I would like to put this data into some kind of table within my Java program that I can then use in my code.
I'm not too sure how to do this. I have been playing around with jSoup and am currently able to print each cell from the table using a while loop, but obviously I can't always use this, as the number of cells in the table will change.
Document doc = Jsoup.connect("http://www.sportinglife.com/greyhounds/abc-guide").get();
int n = 0;
while (n < 100){
    Element tableHeader = doc.select("td").get(n);
    for (Element element : tableHeader.children())
    {
        // Here you can do something with each element
        System.out.println(element.text());
    }
    n++;
}
Any idea of how I could do this?
There are just a few things you have to implement to achieve your goal. Take a look at this Groovy script - https://gist.github.com/wololock/568b9cc402ea661de546
Now let's explain what we have here:
List<Element> rows = document.select('table[id=ABC Guide] > tbody > tr')
Here we're specifying that we are interested in every row tr that is an immediate child of tbody, which in turn is an immediate child of the table with id ABC Guide. In return you receive a list of Element objects that describe those tr rows.
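If you want to try the selector without hitting the site, here is a minimal Java sketch. It assumes jsoup is on the classpath, and the HTML string is a made-up stand-in for the real page (only the table id ABC Guide comes from the selector above; the class name, method name and cell values are hypothetical). I put quotes around the id value in the selector, which is just a style choice on my part.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class RowSelectorSketch {
    // Hypothetical sample markup; the real page will contain far more than this.
    static final String SAMPLE_HTML =
        "<table id=\"ABC Guide\"><tbody>"
      + "<tr><td>Dog A</td><td>Race 1</td></tr>"
      + "<tr><td>Dog B</td><td>Race 2</td></tr>"
      + "</tbody></table>";

    // Returns every tr that is a direct child of tbody inside the table with the given id.
    static Elements selectRows(String html) {
        Document document = Jsoup.parse(html);
        return document.select("table[id=\"ABC Guide\"] > tbody > tr");
    }

    public static void main(String[] args) {
        Elements rows = selectRows(SAMPLE_HTML);
        System.out.println("rows found: " + rows.size());
        for (Element row : rows) {
            // text() joins the text of all cells in the row
            System.out.println(row.text());
        }
    }
}
```

Because the selector returns however many rows the table has, there is no need for a fixed loop bound.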
Map<String, String> data = new HashMap<>()
We will store our result in a simple hash map for further processing, e.g. putting the scraped data into a database.
for (Element row : rows) {
    String dog = row.select('td:eq(0)').text()
    String race = row.select('td:eq(1)').text()
    data.put(dog, race)
}
Now we iterate over every Element and select the content of the first cell as text: String dog = row.select('td:eq(0)').text(). We repeat this step to retrieve the content of the second cell as text: String race = row.select('td:eq(1)').text(). Then we simply put the data into the hash map. That's all.
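One thing to keep in mind with a plain HashMap: keys are unique and iteration order is not the table order, so if the same dog name shows up in two rows, the later row replaces the earlier one. If that matters for your data, a list of small row objects may fit better. Below is a rough Java sketch using only the standard library; the class names, method names and sample values are hypothetical, and the String[] pairs stand in for the two cell texts you would read from each row.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RaceTableSketch {
    // One table row: the text of the first and second cell.
    static final class RaceRow {
        final String dog;
        final String race;

        RaceRow(String dog, String race) {
            this.dog = dog;
            this.race = race;
        }
    }

    // Keeps every row, in the order it was scraped.
    static List<RaceRow> toRows(List<String[]> cells) {
        List<RaceRow> rows = new ArrayList<>();
        for (String[] pair : cells) {
            rows.add(new RaceRow(pair[0], pair[1]));
        }
        return rows;
    }

    // Same data in a HashMap: a repeated dog name overwrites the earlier entry.
    static Map<String, String> toMap(List<String[]> cells) {
        Map<String, String> data = new HashMap<>();
        for (String[] pair : cells) {
            data.put(pair[0], pair[1]);
        }
        return data;
    }

    public static void main(String[] args) {
        // Made-up sample values, not real data from the page.
        List<String[]> scraped = new ArrayList<>();
        scraped.add(new String[] {"Dog A", "Race 1"});
        scraped.add(new String[] {"Dog A", "Race 2"});
        scraped.add(new String[] {"Dog B", "Race 3"});

        for (RaceRow row : toRows(scraped)) {
            System.out.println(row.dog + " | " + row.race);
        }
        System.out.println("list size: " + toRows(scraped).size());
        System.out.println("map size: " + toMap(scraped).size());
    }
}
```

If each dog really appears only once, the hash map is perfectly fine and gives you quick lookup by name.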
I hope this example and its description will help you with developing your application.
EDIT:
Java code sample - https://gist.github.com/wololock/8ccbc6bbec56ef57fc9e