Using JSoup to Extract Specific Table (TableHeaderValue -> TableBodyValue) Contents

Good day,

I am using jsoup to extract data from a table.

The table content is:

<table class="compare-products-table compare-products">
    <thead>
        <tr>
            <th>
            <p>GPSMAP</p>
            </th>
            <th>7x1</th>
            <th>8x0/10x0</th>
            <th>4000/5000</th>
            <th>6000/7000</th>
            <th>7400/7600</th>
            <th>8000/8500</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>Radar Overlay</td>
            <td class="checked">•</td>
            <td class="checked">•</td>
            <td class="checked">•</td>
            <td class="checked">•</td>
            <td class="checked">•</td>
            <td class="checked">•</td>
        </tr>
        <tr>
            <td>Dual Range</td>
            <td></td>
            <td></td>
            <td class="checked">•</td>
            <td class="checked">•</td>
            <td class="checked">•</td>
            <td class="checked">•</td>
        </tr>
        <tr>
            <td>MARPA</td>
            <td></td>
            <td></td>
            <td class="checked">•</td>
            <td class="checked">•</td>
            <td class="checked">•</td>
            <td class="checked">•</td>
        </tr>
        <tr>
            <td>True Color</td>
            <td class="checked">•</td>
            <td class="checked">•</td>
            <td></td>
            <td class="checked">•</td>
            <td class="checked">•</td>
            <td class="checked">•</td>
        </tr>
        <tr>
            <td>Auto Bird Gain</td>
            <td class="checked">•</td>
            <td class="checked">•</td>
            <td></td>
            <td></td>
            <td class="checked">•</td>
            <td class="checked">•</td>
        </tr>
        <tr>
            <td>Echo Trails</td>
            <td class="checked">•</td>
            <td class="checked">•</td>
            <td></td>
            <td></td>
            <td class="checked">•</td>
            <td class="checked">•</td>
        </tr>
        <tr>
            <td>Pulse Expansion <span class="kicker pri sm">NEW</span></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td class="checked">•</td>
            <td class="checked">•</td>
        </tr>
        <tr>
            <td>Dual Radar Support</td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td class="checked">•</td>
            <td class="checked">•</td>
        </tr>
        <tr>
            <td>Programmable antenna parking</td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td class="checked">•</td>
            <td class="checked">•</td>
        </tr>
    </tbody>
</table>

The output should look like this:

7x1 -> Radar Overlay: yes
8x0/10x0 -> Radar Overlay: yes
4000/5000 -> Radar Overlay: yes
6000/7000 -> Radar Overlay: yes
7400/7600 -> Radar Overlay: yes
8000/8500 -> Radar Overlay: yes

7x1 -> Dual Range: no
8x0/10x0 -> Dual Range: no
4000/5000 -> Dual Range: yes
6000/7000 -> Dual Range: yes
7400/7600 -> Dual Range: yes
8000/8500 -> Dual Range: yes

and so on for the remaining rows.

The examples I've seen aren't very clear on how to get the contents when the value depends on an attribute of the table cells.

What I have at the moment:

Elements elementsFeatures = docProductsAttr.select("#featureTab"); // Feature
if (!elementsFeatures.isEmpty()) {
    Elements selectThead = elementsFeatures.select(".compare-products thead tr th:gt(0)"); // get table head, skipping the 1st element
    List<String> collectTableHead = selectThead.stream().map(i -> i.text()).collect(toList()); // collect head text values into a List
    Elements selectTbodyTr = elementsFeatures.select(".compare-products tbody tr"); // select body rows to combine with the head values
}

I would appreciate it if someone could provide the code required to achieve this.

Try this:

Elements elementsFeatures = docProductsAttr.select("#featureTab"); // Feature
if (!elementsFeatures.isEmpty()) {
    for (Element row : elementsFeatures.select(".compare-products tbody tr")) {
        Elements rowCells = row.select("td");
        String gpsMap = rowCells.first().text();
        int i = 1;

        for (Element columnHeader : elementsFeatures.select(".compare-products thead tr th:gt(0)")) {
            System.out.format("%s -> %s: %s%n", columnHeader.text(), gpsMap, yesOrNo(rowCells.get(i)));
            i++;
        }
        System.out.println();
    }
}

private static String yesOrNo(Element rowCell) {
    String ret = "no";
    if (rowCell.hasClass("checked")) {
        ret = "yes";
    }
    return ret;
}

OUTPUT

7x1 -> Radar Overlay: yes
8x0/10x0 -> Radar Overlay: yes
4000/5000 -> Radar Overlay: yes
6000/7000 -> Radar Overlay: yes
7400/7600 -> Radar Overlay: yes
8000/8500 -> Radar Overlay: yes

7x1 -> Dual Range: no
8x0/10x0 -> Dual Range: no
4000/5000 -> Dual Range: yes
6000/7000 -> Dual Range: yes
7400/7600 -> Dual Range: yes
8000/8500 -> Dual Range: yes
...
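
The loop above reads rowCells.get(i) with a running counter, so a row with fewer cells than the header row would throw an IndexOutOfBoundsException. Below is a minimal sketch of the same header-to-cell pairing with a bounds check. It assumes the header texts and the "checked" flags have already been copied out of the jsoup Elements into plain lists. The class FeatureMatrix and its format method are hypothetical names used for illustration, not part of jsoup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of jsoup): pairs column headers with row cells.
class FeatureMatrix {

    /**
     * headers: model names from the header row, with the label column already skipped.
     * rows: feature name -> one flag per model (true where the cell had class "checked").
     * Returns lines of the form "model -> feature: yes|no", grouped by feature.
     */
    static List<String> format(List<String> headers, Map<String, List<Boolean>> rows) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, List<Boolean>> row : rows.entrySet()) {
            List<Boolean> flags = row.getValue();
            for (int i = 0; i < headers.size(); i++) {
                // A row shorter than the header row is reported as "no" instead of throwing.
                boolean checked = i < flags.size() && flags.get(i);
                lines.add(headers.get(i) + " -> " + row.getKey() + ": " + (checked ? "yes" : "no"));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample values taken from the first two rows of the table in the question.
        List<String> headers = Arrays.asList("7x1", "8x0/10x0", "4000/5000");
        Map<String, List<Boolean>> rows = new LinkedHashMap<>();
        rows.put("Radar Overlay", Arrays.asList(true, true, true));
        rows.put("Dual Range", Arrays.asList(false, false, true));
        format(headers, rows).forEach(System.out::println);
    }
}
```

Using a LinkedHashMap keeps the features in table order, so the lines come out in the same sequence as the rows of the original table.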

Details

In the CSS query .compare-products thead tr th:gt(0), the :gt(0) part matches elements whose sibling index is greater than 0, so here it means "select all th except the first one".
