I'm attempting to parse an HTML document using jsoup. What I am trying to do is extract the table data of a specific row. I want to be able to select that row using either the value of the href attribute or the text inside the <a></a> tags.
<tbody>
<tr class="even">
<td><a href="link-1">Link_1</a></td>
<td align="center">9</td>
<td align="center">9</td>
<td align="center">2</td>
</tr>
<tr class="odd">
<td><a href="link-2">Link_2</a></td>
<td align="center">22</td>
<td align="center">4</td>
<td align="center">1</td>
</tr>
<tr class="even">
<td><a href="link-3">Link_3</a></td>
<td align="center">22</td>
<td align="center">7</td>
<td align="center">1</td>
</tr>
</tbody>
Selecting the whole table is easy; I can just use the following:
Document htmlRawData = Jsoup.parse(deviceMetricData.toString());
Elements htmlMetrics = htmlRawData.select("tbody > tr > td[align]");
htmlMetrics.stream().forEach((ele) -> {
System.out.println(ele.toString());
});
This is only ideal when the table has a single row. If it has many, selecting a specific row based on the value of the first cell becomes trickier.
Can anyone help get me started or point me in the right direction?
Remember that you can traverse the DOM tree.
If you know that the structure will always be the same (an a inside a td, which is inside a tr), then you can do it as follows:
Element link = document.select("tbody > tr > td > a[href=\"link-1\"]").first();
link.parent().parent().children().forEach(System.out::println);
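As a minimal, self-contained sketch of the same parent traversal: the class name RowByHref, the helper cellsForHref and the HTML constant are hypothetical names introduced here for illustration. Two assumptions are worth stating. The sample rows are wrapped in a <table> element, because the HTML parser discards row and cell tags that appear outside a table. And first() returns null when nothing matches, so the sketch checks for that before walking up the tree.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

class RowByHref {
    // Sample markup from the question, wrapped in <table> so the parser keeps the rows.
    static final String HTML =
        "<table><tbody>"
        + "<tr class=\"even\"><td><a href=\"link-1\">Link_1</a></td>"
        + "<td align=\"center\">9</td><td align=\"center\">9</td><td align=\"center\">2</td></tr>"
        + "<tr class=\"odd\"><td><a href=\"link-2\">Link_2</a></td>"
        + "<td align=\"center\">22</td><td align=\"center\">4</td><td align=\"center\">1</td></tr>"
        + "<tr class=\"even\"><td><a href=\"link-3\">Link_3</a></td>"
        + "<td align=\"center\">22</td><td align=\"center\">7</td><td align=\"center\">1</td></tr>"
        + "</tbody></table>";

    // Returns the text of the td[align] cells in the row whose link has the given href,
    // or an empty list when no such link exists.
    static List<String> cellsForHref(Document doc, String href) {
        List<String> values = new ArrayList<>();
        Element link = doc.select("tbody > tr > td > a[href=\"" + href + "\"]").first();
        if (link == null) {
            return values; // no matching link, so there is nothing to walk up from
        }
        Element row = link.parent().parent(); // a -> td -> tr
        for (Element cell : row.select("td[align]")) {
            values.add(cell.text());
        }
        return values;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        System.out.println(cellsForHref(doc, "link-2")); // prints [22, 4, 1]
    }
}
```

Returning an empty list for a missing link keeps the caller free of null checks; throwing an exception would be an equally reasonable choice if a missing row is an error in your data.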
You can also filter all rows by the occurrence of this very href value:
final Elements rows = document.select("tbody > tr");
rows
.stream()
// Exact match; getElementsByAttributeValueMatching takes a regex, so "link-1" would also match "link-10".
.filter(tr -> !tr.getElementsByAttributeValue("href", "link-1").isEmpty())
.findFirst()
.map(Element::children)
.ifPresent(System.out::println);
Or by using select:
final Elements rows = document.select("tbody > tr");
rows
.stream()
.filter(tr -> !tr.select("a[href=\"link-1\"]").isEmpty())
.findFirst()
.map(Element::children)
.ifPresent(System.out::println);
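The question also asks about selecting the row by the text of the link. The following is a sketch of two further options, not a definitive implementation: the :has() pseudo-selector lets the query pick the row and its cells in one step, and a plain loop compares the link text exactly. The names RowByHas, cellsByHref and cellsByLinkText are hypothetical, the markup is again wrapped in a <table> so the parser keeps the rows, and eachText() and selectFirst() assume a reasonably recent jsoup release.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.Collections;
import java.util.List;

class RowByHas {
    // Sample markup from the question, wrapped in <table> so the parser keeps the rows.
    static final String HTML =
        "<table><tbody>"
        + "<tr class=\"even\"><td><a href=\"link-1\">Link_1</a></td>"
        + "<td align=\"center\">9</td><td align=\"center\">9</td><td align=\"center\">2</td></tr>"
        + "<tr class=\"odd\"><td><a href=\"link-2\">Link_2</a></td>"
        + "<td align=\"center\">22</td><td align=\"center\">4</td><td align=\"center\">1</td></tr>"
        + "<tr class=\"even\"><td><a href=\"link-3\">Link_3</a></td>"
        + "<td align=\"center\">22</td><td align=\"center\">7</td><td align=\"center\">1</td></tr>"
        + "</tbody></table>";

    // Selects the data cells of the row that contains a link with exactly this href.
    static List<String> cellsByHref(Document doc, String href) {
        return doc.select("tbody > tr:has(a[href=\"" + href + "\"]) > td[align]").eachText();
    }

    // Selects the data cells of the first row whose link text equals the given text.
    static List<String> cellsByLinkText(Document doc, String text) {
        for (Element row : doc.select("tbody > tr")) {
            Element link = row.selectFirst("td > a");
            if (link != null && link.text().equals(text)) {
                return row.select("td[align]").eachText();
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        System.out.println(cellsByHref(doc, "link-3"));     // prints [22, 7, 1]
        System.out.println(cellsByLinkText(doc, "Link_1")); // prints [9, 9, 2]
    }
}
```

The loop is used for the link text because jsoup's :contains() and :containsOwn() selectors match substrings case-insensitively, so "Link_1" would also match a link labelled "Link_10".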