Extract data from HTML table with Jsoup

I want to extract information from this table:

HTML code for this table:

<tr>
 <th>Rank</th>
 <th>Level</th>
 <th>IVs (A/D/S)</th>
 <th>CP</th>
 <th class="hidden-sm">Att</th>
 <th class="hidden-sm">Def</th>
 <th class="hidden-sm">Sta</th>
 <th class="hidden-xs">Stat Product</th>
 <th>% Max Stat</th>
</tr>
<tr class="table-danger">
 <td><b>2997</b></td>
 <td>19.0</td>
 <td>12 / 0 / 5</td>
 <td>1496</td>
 <td class="hidden-sm">128.10</td>
 <td class="hidden-sm">101.90</td>
 <td class="hidden-sm">133</td>
 <td class="hidden-xs">1736099</td>
 <td>93.71%</td>
</tr>
<tr>
 <td>1</td>
 <td>19.0</td>
 <td>0 / 14 / 14</td>
 <td>1498</td>
 <td class="hidden-sm">121.11</td>
 <td class="hidden-sm">110.05</td>
 <td class="hidden-sm">139</td>
 <td class="hidden-xs">1852687</td>
 <td>100.00%</td>
</tr>
...

So far I can only get the table and its rows with this code:

Element table = document.select("table").get(0);
Elements rows = table.select("tr");

How can I extract these stats? The result should look like this:

Rank(2997) | Level (19.0) | IVs (12/0/5) | CP (1496)...

With

Elements td = rows.select("td");
String stats = td.text();

I get a single-line string: 2997 19.0 12 / 0 / 5 1496 128.10 101.90 133 1736099 93.71% 1 19.0 0... and it's hard to work with the information in that form.

I guess I need to store them as Stat objects with these fields and put them into an ArrayList or something similar.

But first I need to extract this data more cleanly, row by row, instead of getting everything on one line. I need the power of Jsoup.

You were on the right track but stopped short. Elements extends ArrayList<Element>, so you can loop over it like any other list.
Let's write a Stat class. Each object of this class stores the data of one row. You can also add getters, setters, and other methods for your business logic:

public class Stat {
    private String rank;
    private String level;
    private String ivs;
    private String cp;
    private String att;
    private String def;
    private String sta;
    private String statProduct;
    private String maxStat;

    public Stat(String rank, String level, String ivs, String cp, String att, String def, String sta, String statProduct, String maxStat) {
        this.rank = rank;
        this.level = level;
        this.ivs = ivs;
        this.cp = cp;
        this.att = att;
        this.def = def;
        this.sta = sta;
        this.statProduct = statProduct;
        this.maxStat = maxStat;
    }

    @Override
    public String toString() {
        return "Stat{" +
                "rank='" + rank + '\'' +
                ", level='" + level + '\'' +
                ", ivs='" + ivs + '\'' +
                ", cp='" + cp + '\'' +
                ", att='" + att + '\'' +
                ", def='" + def + '\'' +
                ", sta='" + sta + '\'' +
                ", statProduct='" + statProduct + '\'' +
                ", maxStat='" + maxStat + '\'' +
                '}';
    }
}
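
As an aside, every field above is a String, so later code may want numeric values. The following is a minimal sketch of how two of the cell texts could be converted with plain Java. It is not part of the original answer: the class name StatValues and its methods are hypothetical, and it assumes the cells always look like the sample rows ("12 / 0 / 5", "93.71%").

```java
import java.util.Arrays;

// Hypothetical helper (not from the original answer): converts cell texts to numbers.
public class StatValues {

    // Splits an IV cell such as "12 / 0 / 5" into {12, 0, 5}.
    // Assumes exactly three integers separated by slashes.
    public static int[] parseIvs(String ivs) {
        String[] parts = ivs.trim().split("\\s*/\\s*");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected A / D / S but got: " + ivs);
        }
        int[] result = new int[3];
        for (int i = 0; i < parts.length; i++) {
            result[i] = Integer.parseInt(parts[i]);
        }
        return result;
    }

    // Converts a percentage cell such as "93.71%" into the double 93.71.
    public static double parsePercent(String percent) {
        String trimmed = percent.trim();
        if (trimmed.endsWith("%")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1);
        }
        return Double.parseDouble(trimmed.trim());
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseIvs("12 / 0 / 5"))); // [12, 0, 5]
        System.out.println(parsePercent("93.71%"));                  // 93.71
    }
}
```

Keeping the raw strings in Stat and converting on demand is one option; storing int and double fields directly is another, at the cost of failing earlier on malformed cells.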

All that remains is to loop over the rows. Continuing your code:

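Elements rows = table.select("tr");

for (Element row : rows) {
    // Select only the <td> cells of this row, so nested tags such as <b> do not shift the indexes
    Elements td = row.select("td");
    if (td.size() < 9) {
        continue; // the header row has <th> cells only, so it is skipped here
    }
    Stat stat = new Stat(
            td.get(0).text(), // Rank
            td.get(1).text(), // Level
            td.get(2).text(), // IVs (A/D/S)
            td.get(3).text(), // CP
            td.get(4).text(), // Att
            td.get(5).text(), // Def
            td.get(6).text(), // Sta
            td.get(7).text(), // Stat Product
            td.get(8).text()  // % Max Stat
    );

    System.out.println(stat);
}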
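
The question asks for output of the form Rank(2997) | Level (19.0) | ... One way to get there is to pair each header text with the matching cell text. The sketch below shows only that pairing step in plain Java, with hard-coded lists standing in for the th and td texts (in Jsoup, Elements.eachText() returns such a list). The class name RowFormatter, the method format, and the shortened "IVs" label are assumptions for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not from the original answer): joins headers and cell values.
public class RowFormatter {

    // Builds "Header (value) | Header (value) | ..." from two lists of equal length.
    public static String format(List<String> headers, List<String> cells) {
        if (headers.size() != cells.size()) {
            throw new IllegalArgumentException(
                    "Got " + headers.size() + " headers but " + cells.size() + " cells");
        }
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < headers.size(); i++) {
            parts.add(headers.get(i) + " (" + cells.get(i) + ")");
        }
        return String.join(" | ", parts);
    }

    public static void main(String[] args) {
        // Stand-ins for the texts of the <th> cells and of one row's <td> cells
        List<String> headers = Arrays.asList("Rank", "Level", "IVs", "CP");
        List<String> cells = Arrays.asList("2997", "19.0", "12 / 0 / 5", "1496");
        System.out.println(format(headers, cells));
        // Rank (2997) | Level (19.0) | IVs (12 / 0 / 5) | CP (1496)
    }
}
```

The size check makes a row with missing cells fail loudly instead of silently pairing values with the wrong headers.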
