
How to loop through this query using Jsoup?

I want to loop through the news table and get the title and rating of each row. I have tried several selectors, but I can't understand why the select method returns everything at once. I need to get each news block in a loop.

This is how I select the table:

Elements elements = document.select("#hnmain > tbody > tr:nth-child(3) > td > table");

This query doesn't work in a loop because it returns all the elements at once. I need to get the elements one by one, so that I can do something like this:

List<String> list = new ArrayList<>();

for (Element element: elements){
     String title = element...
     String rating = element...
     list.add(title);
     list.add(rating);
}

Sample data from html:

<table border="0" cellpadding="0" cellspacing="0">
 <tbody>
  <tr class="athing" id="33582264">
   <td align="right" valign="top" class="title"><span class="rank">1.</span></td>
   <td valign="top" class="votelinks">
    <center>
     <a id="up_33582264" href="vote?id=33582264&amp;how=up&amp;goto=front%3Fday%3D2022-11-13">
      <div class="votearrow" title="upvote"></div></a>
    </center></td>
   <td class="title"><span class="titleline"><a href="https://upbase.io/">Show HN: I built my own PM tool after trying Trello, Asana, ClickUp, etc.</a><span class="sitebit comhead"> (<a href="from?site=upbase.io"><span class="sitestr">upbase.io</span></a>)</span></span></td>
  </tr>
  <tr>
   <td colspan="2"></td>
   <td class="subtext"><span class="subline"> <span class="score" id="score_33582264">632 points</span> by <a href="user?id=tonypham" class="hnuser">tonypham</a> <span class="age" title="2022-11-13T12:00:06"><a href="item?id=33582264">20 days ago</a></span> <span id="unv_33582264"></span> | <a href="hide?id=33582264&amp;goto=front%3Fday%3D2022-11-13">hide</a> | <a href="item?id=33582264">456&nbsp;comments</a> </span></td>
  </tr>
  <tr class="spacer" style="height:5px"></tr>
  <tr class="athing" id="33584941">
   <td align="right" valign="top" class="title"><span class="rank">2.</span></td>
   <td valign="top" class="votelinks">
    <center>
     <a id="up_33584941" href="vote?id=33584941&amp;how=up&amp;goto=front%3Fday%3D2022-11-13">
      <div class="votearrow" title="upvote"></div></a>
    </center></td>
   <td class="title"><span class="titleline"><a href="https://fathy.fr/html2svg">Forking Chrome to turn HTML into SVG</a><span class="sitebit comhead"> (<a href="from?site=fathy.fr"><span class="sitestr">fathy.fr</span></a>)</span></span></td>
  </tr>

If I understand your question correctly, I think this code will work for you:

Document doc = Jsoup.parse("<table border=\"0\" id=\"hnmain\" cellpadding=\"0\" cellspacing=\"0\"> <tbody> <tr class=\"athing\" id=\"33582264\"> <td align=\"right\" valign=\"top\" class=\"title\"><span class=\"rank\">1.</span></td> <td valign=\"top\" class=\"votelinks\"> <center> <a id=\"up_33582264\" href=\"vote?id=33582264&amp;how=up&amp;goto=front%3Fday%3D2022-11-13\"> <div class=\"votearrow\" title=\"upvote\"></div></a> </center></td> <td class=\"title\"><span class=\"titleline\"><a href=\"https://upbase.io/\">Show HN: I built my own PM tool after trying Trello, Asana, ClickUp, etc.</a><span class=\"sitebit comhead\"> (<a href=\"from?site=upbase.io\"><span class=\"sitestr\">upbase.io</span></a>)</span></span></td> </tr> <tr> <td colspan=\"2\"></td> <td class=\"subtext\"><span class=\"subline\"> <span class=\"score\" id=\"score_33582264\">632 points</span> by <a href=\"user?id=tonypham\" class=\"hnuser\">tonypham</a> <span class=\"age\" title=\"2022-11-13T12:00:06\"><a href=\"item?id=33582264\">20 days ago</a></span> <span id=\"unv_33582264\"></span> | <a href=\"hide?id=33582264&amp;goto=front%3Fday%3D2022-11-13\">hide</a> | <a href=\"item?id=33582264\">456&nbsp;comments</a> </span></td> </tr> <tr class=\"spacer\" style=\"height:5px\"></tr> <tr class=\"athing\" id=\"33584941\"> <td align=\"right\" valign=\"top\" class=\"title\"><span class=\"rank\">2.</span></td> <td valign=\"top\" class=\"votelinks\"> <center> <a id=\"up_33584941\" href=\"vote?id=33584941&amp;how=up&amp;goto=front%3Fday%3D2022-11-13\"> <div class=\"votearrow\" title=\"upvote\"></div></a> </center></td> <td class=\"title\"><span class=\"titleline\"><a href=\"https://fathy.fr/html2svg\">Forking Chrome to turn HTML into SVG</a><span class=\"sitebit comhead\"> (<a href=\"from?site=fathy.fr\"><span class=\"sitestr\">fathy.fr</span></a>)</span></span></td> </tr>");
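    // Each story row carries the class "athing", so this yields one Element per news block.
    Elements elements = doc.select("#hnmain .athing");
    for (Element element : elements) {
        // Both the rank cell and the title cell have class "title", so target the headline link directly.
        String title = element.select(".titleline > a").text();
        String rank = element.select(".rank").text();

        System.out.println(title + " -- " + rank);
    }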
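
The question asks for the rating rather than the rank, and in the sample markup the score span sits in the row that follows each "athing" row, not inside it. The selector from the question ends in "table", so it most likely matches the single table element itself, which would explain why the loop body runs only once. Below is a minimal sketch of one way to pair each title with its score, assuming the page keeps the structure shown in the sample. The class name HnRows, the NewsItem record, the extract method and the trimmed SAMPLE string are hypothetical names introduced for illustration; they are not part of the original question or answer.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class HnRows {
    // Hypothetical holder type for one news block (illustration only).
    record NewsItem(String title, String rating) {}

    // Trimmed copy of the sample markup from the question (assumption: same structure as the real page).
    static final String SAMPLE = """
        <table id="hnmain"><tbody>
          <tr class="athing" id="33582264">
            <td class="title"><span class="rank">1.</span></td>
            <td class="title"><span class="titleline"><a href="https://upbase.io/">Show HN: I built my own PM tool after trying Trello, Asana, ClickUp, etc.</a><span class="sitebit comhead"> (<a href="from?site=upbase.io"><span class="sitestr">upbase.io</span></a>)</span></span></td>
          </tr>
          <tr>
            <td colspan="2"></td>
            <td class="subtext"><span class="subline"><span class="score" id="score_33582264">632 points</span></span></td>
          </tr>
          <tr class="spacer" style="height:5px"></tr>
          <tr class="athing" id="33584941">
            <td class="title"><span class="rank">2.</span></td>
            <td class="title"><span class="titleline"><a href="https://fathy.fr/html2svg">Forking Chrome to turn HTML into SVG</a></span></td>
          </tr>
        </tbody></table>
        """;

    static List<NewsItem> extract(String html) {
        Document doc = Jsoup.parse(html);
        List<NewsItem> items = new ArrayList<>();
        // One "tr.athing" per story, so the loop body runs once per news block.
        for (Element row : doc.select("#hnmain tr.athing")) {
            // Direct child link of .titleline is the headline; the site link is nested deeper.
            String title = row.select(".titleline > a").text();
            // The score lives in the row immediately after the story row, if there is one.
            Element subRow = row.nextElementSibling();
            String rating = subRow == null ? "" : subRow.select(".score").text();
            items.add(new NewsItem(title, rating));
        }
        return items;
    }

    public static void main(String[] args) {
        for (NewsItem item : extract(SAMPLE)) {
            System.out.println(item.title() + " -- " + item.rating());
        }
    }
}
```

The sample in the question is cut off before the second story's score row, so with this input the second item comes back with an empty rating. Stories without a score span would be handled the same way.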

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please credit the site URL or the original address.

 