Using jsoup to Extract Specific Table Contents (TableHeaderValue -> TableBodyValue)
Good day,
I am using jsoup to extract data from a table.
The table content is:
<table class="compare-products-table compare-products">
<thead>
<tr>
<th>
<p>GPSMAP</p>
</th>
<th>7x1</th>
<th>8x0/10x0</th>
<th>4000/5000</th>
<th>6000/7000</th>
<th>7400/7600</th>
<th>8000/8500</th>
</tr>
</thead>
<tbody>
<tr>
<td>Radar Overlay</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
<tr>
<td>Dual Range</td>
<td></td>
<td></td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
<tr>
<td>MARPA</td>
<td></td>
<td></td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
<tr>
<td>True Color</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td></td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
<tr>
<td>Auto Bird Gain</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td></td>
<td></td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
<tr>
<td>Echo Trails</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td></td>
<td></td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
<tr>
<td>Pulse Expansion <span class="kicker pri sm">NEW</span></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
<tr>
<td>Dual Radar Support</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
<tr>
<td>Programmable antenna parking</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
</tbody>
</table>
The output should look like this:
7x1 -> Radar Overlay: yes
8x0/10x0 -> Radar Overlay: yes
4000/5000 -> Radar Overlay: yes
6000/7000 -> Radar Overlay: yes
7400/7600 -> Radar Overlay: yes
8000/8500 -> Radar Overlay: yes
7x1 -> Dual Range: no
8x0/10x0 -> Dual Range: no
4000/5000 -> Dual Range: yes
6000/7000 -> Dual Range: yes
7400/7600 -> Dual Range: yes
8000/8500 -> Dual Range: yes
etc.
The examples I've seen aren't too clear on how to get the contents when the value is carried by an attribute on the table cells.
What I have at the moment:
Elements elementsFeatures = docProductsAttr.select("#featureTab"); // Feature
if (!elementsFeatures.isEmpty()) {
    Elements selectThead = elementsFeatures.select(".compare-products thead tr th:gt(0)"); // get Table Head skipping 1st element
    List<String> collectTableHead = selectThead.stream().map(i -> i.text()).collect(toList()); // collect head text value to List
    Elements selectTbodyTr = elementsFeatures.select(".compare-products tbody tr"); // select Body tr to mix it with Head value
}
I would appreciate it if someone could provide the code required to achieve this.
Try this:
Elements elementsFeatures = docProductsAttr.select("#featureTab"); // Feature
if (!elementsFeatures.isEmpty()) {
    for (Element row : elementsFeatures.select(".compare-products tbody tr")) {
        Elements rowCells = row.select("td");
        String gpsMap = rowCells.first().text();
        int i = 1;
        for (Element columnHeader : elementsFeatures.select(".compare-products thead tr th:gt(0)")) {
            System.out.format("%s -> %s: %s%n", columnHeader.text(), gpsMap, yesOrNo(rowCells.get(i)));
            i++;
        }
        System.out.println();
    }
}
private static String yesOrNo(Element rowCell) {
    String ret = "no";
    if (rowCell.hasClass("checked")) {
        ret = "yes";
    }
    return ret;
}
Output:
7x1 -> Radar Overlay: yes
8x0/10x0 -> Radar Overlay: yes
4000/5000 -> Radar Overlay: yes
6000/7000 -> Radar Overlay: yes
7400/7600 -> Radar Overlay: yes
8000/8500 -> Radar Overlay: yes
7x1 -> Dual Range: no
8x0/10x0 -> Dual Range: no
4000/5000 -> Dual Range: yes
6000/7000 -> Dual Range: yes
7400/7600 -> Dual Range: yes
8000/8500 -> Dual Range: yes
...
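To make the pairing step easier to follow on its own, here is a minimal sketch that separates it from jsoup. It assumes the header texts and the per-cell "checked" flags have already been read from the document; the class FeatureMatrix and the method describeRow are hypothetical names used only for illustration. Unlike the loop above, it treats a missing cell as "no" instead of throwing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of jsoup: pairs column headers with one table row.
class FeatureMatrix {

    // headers:  column header texts, with the first (label) column already skipped.
    // rowLabel: text of the first cell in the row, e.g. "Dual Range".
    // checked:  one flag per remaining cell, true when the cell has class "checked".
    static List<String> describeRow(List<String> headers, String rowLabel, List<Boolean> checked) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < headers.size(); i++) {
            // Assumption: a row with fewer cells than headers counts as "no" for the missing ones.
            boolean isChecked = i < checked.size() && checked.get(i);
            lines.add(String.format("%s -> %s: %s", headers.get(i), rowLabel, isChecked ? "yes" : "no"));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> headers = Arrays.asList("7x1", "8x0/10x0", "4000/5000");
        List<Boolean> dualRange = Arrays.asList(false, false, true);
        for (String line : describeRow(headers, "Dual Range", dualRange)) {
            System.out.println(line);
        }
    }
}
```

With jsoup, the flags would come from calling hasClass("checked") on each td after the first, as yesOrNo does above.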
Details
In the CSS query .compare-products thead tr th:gt(0), :gt(0) matches elements whose sibling index is greater than 0, so th:gt(0) means "select all th except the first one".
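As a rough illustration of what "sibling index" means here, the sketch below models th:gt(n) in plain Java. It is not how jsoup implements the selector; GtSelectorModel and matchGt are made-up names, and the row is reduced to a list of tag names. The point it shows is that the index is zero-based and counts a cell's position among all element children of its parent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of th:gt(n); illustrative only, not jsoup's implementation.
class GtSelectorModel {

    // childTags: tag names of one parent's element children, in document order.
    // Returns the sibling indexes of children named <tag> whose index is greater than n.
    static List<Integer> matchGt(List<String> childTags, String tag, int n) {
        List<Integer> matches = new ArrayList<>();
        for (int index = 0; index < childTags.size(); index++) {
            if (index > n && childTags.get(index).equals(tag)) {
                matches.add(index);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // The header row from the question: seven th cells, the first one holding "GPSMAP".
        List<String> headerRow = Arrays.asList("th", "th", "th", "th", "th", "th", "th");
        System.out.println(matchGt(headerRow, "th", 0)); // prints [1, 2, 3, 4, 5, 6]
    }
}
```

Index 0 is the GPSMAP cell, which is why the six model names are the ones left over.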