Parsing an HTML table using jsoup

I am trying to parse an HTML table using jsoup. I am new to jsoup and have read some tutorials on it. I need to extract the values of each column from the table on this page: https://www.basketball-reference.com/boxscores/pbp/201905160GSW.html. I tried to get every timestamp, but my code prints only a single element. This is the code I tried last.

Document doc = Jsoup.connect("https://www.basketball-reference.com/boxscores/pbp/201905160GSW.html").get();         
Elements trs = doc.select("table");

for(Element tr : trs) {
    Elements tds = tr.getElementsByTag("td");
    Element td = tds.get(0);
    System.out.println(td.text());
}

Do you understand your code?

It selects every table element in the document and puts them into the trs variable (so, despite the name, trs holds tables, not rows):

Elements trs = doc.select("table");

Then it iterates over each table:

for(Element tr : trs) {

From each table it selects all td cells:

Elements tds = tr.getElementsByTag("td");

Then it takes only the first cell:

Element td = tds.get(0);

and prints its contents:

System.out.println(td.text());

Because the loop body runs once per table, not once per row, the output is one value per table, which is why you see a single element. Some of these steps are not necessary, but with this explanation you should have a good start.
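Following that explanation, one possible fix is to loop over the rows (tr) instead of the tables and read the td cells of each row. The sketch below is only an illustration under some assumptions: the class name TableRows, the helper methods cellTexts and firstColumn, and the SAMPLE_HTML string are made up for this example, and the inline HTML is a stand-in, not the real markup of the page in the question. The selector "table tr" may need narrowing if the real page contains more than one table.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class TableRows {
    // Hypothetical stand-in for the real page, so the example runs without a network call.
    static final String SAMPLE_HTML =
        "<table>"
        + "<tr><th>Time</th><th>Event</th></tr>"
        + "<tr><td>12:00.0</td><td>Start of 1st quarter</td></tr>"
        + "<tr><td>11:42.0</td><td>Jump ball</td></tr>"
        + "</table>";

    // Returns the text of every td cell, grouped by row.
    // Rows without any td cell (for example header rows that use th) are skipped.
    static List<List<String>> cellTexts(Document doc) {
        List<List<String>> rows = new ArrayList<>();
        for (Element tr : doc.select("table tr")) {
            List<String> cells = new ArrayList<>();
            for (Element td : tr.select("td")) {
                cells.add(td.text());
            }
            if (!cells.isEmpty()) {
                rows.add(cells);
            }
        }
        return rows;
    }

    // Returns the first cell of each row, e.g. the timestamp column.
    static List<String> firstColumn(List<List<String>> rows) {
        List<String> column = new ArrayList<>();
        for (List<String> row : rows) {
            column.add(row.get(0)); // safe: cellTexts never returns empty rows
        }
        return column;
    }

    public static void main(String[] args) {
        // For the real page, replace this with Jsoup.connect(url).get() as in the question.
        Document doc = Jsoup.parse(SAMPLE_HTML);
        List<List<String>> rows = cellTexts(doc);
        for (String first : firstColumn(rows)) {
            System.out.println(first);
        }
    }
}
```

Keeping every row as a list of cell texts means any other column can be read the same way by index, as long as the rows have the same number of cells. That does not always hold on real pages, for example when a cell spans several columns, so checking the row size before indexing is a reasonable precaution.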
