
Using jsoup to get data from first column of table

For the purpose of my question, I have created a simple HTML page, an extract of which is the following:

<table class="fruit-vegetables">
  <thead>
    <th>Fruit</th>
    <th>Vegetables</th>
  </thead>
  <tbody>
    <tr>
      <td>
        <b>
          <a href="https://en.wikipedia.org/wiki/Apple" title="Apples">Apples</a>
        </b>
      </td>
      <td>
        <a href="https://en.wikipedia.org/wiki/Carrot" title="Carrots">Carrots</a>
      </td>
    </tr>
    <tr>
      <td>
        <i>
          <a href="https://en.wikipedia.org/wiki/Orange_%28fruit%29" title="Oranges">Oranges</a>
        </i>
      </td>
      <td>
        <a href="https://en.wikipedia.org/wiki/Pea" title="Peas">Peas</a>
      </td>
    </tr>
  </tbody>
</table>

I want to extract the data from the first column called "Fruit" using Jsoup. Thus, the result should be:

Apples
Oranges

I have written a program, an extract of which is the following:

// In reality, the document would be fetched with Jsoup.connect(url).get().
// Here, assume the String `html` holds the full page source.
Document doc = Jsoup.parse(html); 

Elements table = doc.select("table.fruit-vegetables").select("tbody").select("tr").select("td").select("a");

for(Element element : table){
    System.out.println(element.text());
}

The result of this program is:

Apples
Carrots
Oranges
Peas

I know something is wrong, but I can't find my mistake. None of the other questions on Stack Overflow solved my problem. What do I have to do?

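Your chain of select calls matches every <td> in each row, so the links from both columns are returned. You seem to be looking for: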

Elements el = doc.select("table.fruit-vegetables td:eq(0)");
for (Element e : el){
    System.out.println(e.text());
}

The jsoup selector syntax page (http://jsoup.org/cookbook/extracting-data/selector-syntax) describes :eq(n) as:

:eq(n): find elements whose sibling index is equal to n; e.g. form input:eq(1)

So td:eq(0) selects each <td> that is the first child of its parent, which in this case is the <tr>. The index is zero-based, so 0 means the first column.
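
To make the "sibling index" idea concrete, here is a minimal sketch that reproduces the same filtering by hand. It deliberately does not use jsoup: it relies only on the JDK's XML DOM parser, so it works only because the sample table happens to be well-formed XML. The class name SiblingIndexSketch, the helper names, and the trimmed SAMPLE string (with the links shortened to "#") are assumptions made for illustration, not part of the original question or of jsoup.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SiblingIndexSketch {
    // Trimmed, well-formed copy of the table from the question (hrefs shortened).
    static final String SAMPLE =
        "<table class=\"fruit-vegetables\"><tbody>"
      + "<tr><td><b><a href=\"#\">Apples</a></b></td><td><a href=\"#\">Carrots</a></td></tr>"
      + "<tr><td><i><a href=\"#\">Oranges</a></i></td><td><a href=\"#\">Peas</a></td></tr>"
      + "</tbody></table>";

    // Zero-based position of an element among its element siblings.
    // Text nodes (such as whitespace between tags) are not counted.
    static int elementSiblingIndex(Element el) {
        int index = 0;
        for (Node n = el.getPreviousSibling(); n != null; n = n.getPreviousSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                index++;
            }
        }
        return index;
    }

    // Collects the text of every <td> whose sibling index equals `wanted`,
    // which is roughly what the selector td:eq(wanted) expresses.
    static List<String> cellsAtIndex(String xml, int wanted) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList cells = doc.getElementsByTagName("td"); // document order
            List<String> result = new ArrayList<>();
            for (int i = 0; i < cells.getLength(); i++) {
                Element td = (Element) cells.item(i);
                if (elementSiblingIndex(td) == wanted) {
                    result.add(td.getTextContent().trim());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // Prints:
        // Apples
        // Oranges
        for (String text : cellsAtIndex(SAMPLE, 0)) {
            System.out.println(text);
        }
    }
}
```

Passing 1 instead of 0 would return the "Vegetables" column. With real-world HTML, which is rarely well-formed XML, the jsoup selector shown above remains the practical choice; this sketch only shows what the index in :eq(n) counts.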
