Using JSoup to Extract Specific Table (TableHeaderValue -> TableBodyValue) Contents
Good day,
I am using jsoup to extract data from a table.
The table content is:
<table class="compare-products-table compare-products">
<thead>
<tr>
<th>
<p>GPSMAP</p>
</th>
<th>7x1</th>
<th>8x0/10x0</th>
<th>4000/5000</th>
<th>6000/7000</th>
<th>7400/7600</th>
<th>8000/8500</th>
</tr>
</thead>
<tbody>
<tr>
<td>Radar Overlay</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
<tr>
<td>Dual Range</td>
<td></td>
<td></td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
<tr>
<td>MARPA</td>
<td></td>
<td></td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
<tr>
<td>True Color</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td></td>
<td class="checked">•</td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
<tr>
<td>Auto Bird Gain</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td></td>
<td></td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
<tr>
<td>Echo Trails</td>
<td class="checked">•</td>
<td class="checked">•</td>
<td></td>
<td></td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
<tr>
<td>Pulse Expansion <span class="kicker pri sm">NEW</span></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
<tr>
<td>Dual Radar Support</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
<tr>
<td>Programmable antenna parking</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td class="checked">•</td>
<td class="checked">•</td>
</tr>
</tbody>
</table>
The output should look like this:
7x1 -> Radar Overlay: yes
8x0/10x0 -> Radar Overlay: yes
4000/5000 -> Radar Overlay: yes
6000/7000 -> Radar Overlay: yes
7400/7600 -> Radar Overlay: yes
8000/8500 -> Radar Overlay: yes
7x1 -> Dual Range: no
8x0/10x0 -> Dual Range: no
4000/5000 -> Dual Range: yes
6000/7000 -> Dual Range: yes
7400/7600 -> Dual Range: yes
8000/8500 -> Dual Range: yes
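and so on.
The examples I have seen do not make it clear how to get the cell contents together with their table attributes (the matching header values).
What I have at the moment: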
Elements elementsFeatures = docProductsAttr.select("#featureTab"); // Feature
if (!elementsFeatures.isEmpty()) {
    Elements selectThead = elementsFeatures.select(".compare-products thead tr th:gt(0)"); // get Table Head skipping 1st element
    List<String> collectTableHead = selectThead.stream().map(i -> i.text()).collect(toList()); // collect head text value to List
    Elements selectTbodyTr = elementsFeatures.select(".compare-products tbody tr"); // select Body tr to mix it with Head value
}
I would be grateful if someone could provide the code needed to achieve this.
Try this:
Elements elementsFeatures = docProductsAttr.select("#featureTab"); // Feature
if (!elementsFeatures.isEmpty()) {
    for (Element row : elementsFeatures.select(".compare-products tbody tr")) {
        Elements rowCells = row.select("td");
        String gpsMap = rowCells.first().text();
        int i = 1;
        for (Element columnHeader : elementsFeatures.select(".compare-products thead tr th:gt(0)")) {
            System.out.format("%s -> %s: %s%n", columnHeader.text(), gpsMap, yesOrNo(rowCells.get(i)));
            i++;
        }
        System.out.println();
    }
}

private static String yesOrNo(Element rowCell) {
    String ret = "no";
    if (rowCell.hasClass("checked")) {
        ret = "yes";
    }
    return ret;
}
OUTPUT
7x1 -> Radar Overlay: yes
8x0/10x0 -> Radar Overlay: yes
4000/5000 -> Radar Overlay: yes
6000/7000 -> Radar Overlay: yes
7400/7600 -> Radar Overlay: yes
8000/8500 -> Radar Overlay: yes
7x1 -> Dual Range: no
8x0/10x0 -> Dual Range: no
4000/5000 -> Dual Range: yes
6000/7000 -> Dual Range: yes
7400/7600 -> Dual Range: yes
8000/8500 -> Dual Range: yes
...
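If it helps to see the pairing step on its own, here is a minimal sketch that uses only plain Java collections, with no jsoup involved. It assumes the header texts and the per-row "checked" flags have already been extracted; the class name FeatureMatrix and the method describeRow are hypothetical names chosen for illustration. Unlike the loop above, it reports "no" for a row that has fewer cells than there are headers instead of throwing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of jsoup: pairs column headers with one row's flags.
class FeatureMatrix {

    // Builds one "header -> feature: yes/no" line per column header.
    // A header with no matching flag (a short row) is reported as "no".
    static List<String> describeRow(List<String> headers, String feature, List<Boolean> checked) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < headers.size(); i++) {
            boolean supported = i < checked.size() && checked.get(i);
            lines.add(String.format("%s -> %s: %s", headers.get(i), feature, supported ? "yes" : "no"));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Header texts as they appear in the table, first column already skipped.
        List<String> headers = Arrays.asList(
                "7x1", "8x0/10x0", "4000/5000", "6000/7000", "7400/7600", "8000/8500");

        // Two rows from the table; true means the cell has class "checked".
        Map<String, List<Boolean>> rows = new LinkedHashMap<>();
        rows.put("Radar Overlay", Arrays.asList(true, true, true, true, true, true));
        rows.put("Dual Range", Arrays.asList(false, false, true, true, true, true));

        for (Map.Entry<String, List<Boolean>> row : rows.entrySet()) {
            describeRow(headers, row.getKey(), row.getValue()).forEach(System.out::println);
        }
    }
}
```

With jsoup, the flags for a row would come from calling hasClass("checked") on every td after the first, and the headers from the th:gt(0) selection, as in the code above.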
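DETAILS
In the CSS query .compare-products thead tr th:gt(0), the :gt(0) part matches elements whose sibling index is greater than 0, so it means "select every th except the first one".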