Extract rows by data-id attribute from HTML with Jsoup

I'm new to Jsoup and I'm trying to scrape some data from a website with it. I want to extract only the data in rows that have a specific data-id attribute. This is the structure of the page:

<tr data-id="13">
  <td class="th">Dimension</td>
  <td class="l">152.5x82x9.8mm (6x3.23x0.39")</td>
</tr>
<tr class="even" data-id="15">
  <td class="th">Weight</td>
  <td class="l">190gr (6.7oz)</td>
</tr>
<tr class="h" data-id="116">
   <td class="th">Ringtone</td>
   <td class="l"></td>
</tr>

I need to get something like this:

  1. Dimension
  2. 190gr
  3. Ringtone

Please help me.

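This is what I tried:

 import org.jsoup.Jsoup;
 import org.jsoup.nodes.Document;

 public class Main {
     public static void main(String[] args) throws Exception {
         final Document document = Jsoup.connect("url").get();

         // Each select() narrows the previous match: table, tbody, .even, td.l
         String testString = document.select("table")
                 .select("tbody").select(".even")
                 .select("td.l").text();
         System.out.println("the title is " + testString);
     }
 }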

This is the output: 152.5x82x9.8mm (6x3.23x0.39") 190gr and so on

    // Find each row by its data-id attribute, then read the cell with class "l".
    Document out = Jsoup.connect("https://www.phonegg.com/phone/9858-Energizer-Power-Max-P600s-32GB/%22")
            .timeout(15000).get();
    String dimension = out.getElementsByAttributeValue("data-id", "13").get(0).getElementsByClass("l").text();
    String weight = out.getElementsByAttributeValue("data-id", "15").get(0).getElementsByClass("l").text();
    String ringtone = out.getElementsByAttributeValue("data-id", "116").get(0).getElementsByClass("l").text();
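
The same lookup can also be written with Jsoup's CSS attribute selector. The following is only a minimal sketch, not the answerer's code. It parses the sample markup from the question as an inline string instead of fetching the page, and it returns an empty string when a row is missing instead of throwing. The surrounding <table> wrapper, the class name DataIdExample and the helper cellText are assumptions made for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class DataIdExample {

    // Sample rows copied from the question; the <table>/<tbody> wrapper is an
    // assumption so that the HTML parser keeps the <tr> elements.
    static final String HTML =
            "<table><tbody>"
          + "<tr data-id=\"13\"><td class=\"th\">Dimension</td>"
          + "<td class=\"l\">152.5x82x9.8mm (6x3.23x0.39\")</td></tr>"
          + "<tr class=\"even\" data-id=\"15\"><td class=\"th\">Weight</td>"
          + "<td class=\"l\">190gr (6.7oz)</td></tr>"
          + "<tr class=\"h\" data-id=\"116\"><td class=\"th\">Ringtone</td>"
          + "<td class=\"l\"></td></tr>"
          + "</tbody></table>";

    // Hypothetical helper: text of the <td> with the given class inside the
    // row whose data-id equals dataId, or "" when nothing matches.
    static String cellText(Document doc, String dataId, String cellClass) {
        Element cell = doc.selectFirst("tr[data-id=" + dataId + "] td." + cellClass);
        return cell == null ? "" : cell.text();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        System.out.println(cellText(doc, "13", "th"));   // label cell of row 13
        System.out.println(cellText(doc, "15", "l"));    // value cell of row 15
        System.out.println(cellText(doc, "116", "th"));  // label cell of row 116
    }
}
```

With the sample markup this prints Dimension, 190gr (6.7oz) and Ringtone on separate lines. The value cell of row 15 holds the full string "190gr (6.7oz)", so reducing it to "190gr" would need an extra string step after the lookup.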
