
How to fetch data from ul li in Android Studio with jsoup

I am trying to get Product 1 and Product 2, but I can't get them. Please help. I am using jsoup and Volley.

<ul id="searched-products">
    <li>
        <div class="gd-col navUnitContainer1 gu4">
            <div class="product_name">
                <a>Prodict 1</a>
            </div>
        </div>
    </li>
    <li>
        <div class="gd-col navUnitContainer1 gu4">
            <div class="product_name">
                <a>Prodict 2</a>
            </div>
        </div>
    </li>
</ul>

I have tried this

Elements itemElements = doc.select("ul#searched-products li");

but it's not selecting the "li" elements. I have also tried this

Elements itemElements = doc.select("ul#searched-products"); // this line works
Element e1 = itemElements.first();
Elements items = e1.select("li"); // or: e1.getElementsByTag("li")

Still no good. There are hundreds of li elements on the page, so I can't do this

doc.select("li");

Kindly suggest something

Like this:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupList {

    public static void main(String[] args) {
        // The markup from the question, as a string
        String html = "<ul id=\"searched-products\">" +
                "<li>" +
                    "<div class=\"gd-col navUnitContainer1 gu4\">" +
                        "<div class=\"product_name\">" +
                            "<a>Prodict 1</a>" +
                        "</div>" +
                    "</div>" +
                "</li>" +
                "<li>" +
                    "<div class=\"gd-col navUnitContainer1 gu4\">" +
                        "<div class=\"product_name\">" +
                            "<a>Prodict 2</a>" +
                        "</div>" +
                    "</div>" +
                "</li>" +
                "</ul>";

        Document doc = Jsoup.parse(html);

        // Every <li> inside the list with id "searched-products"
        Elements itemElements = doc.select("ul#searched-products li");

        for (Element elem : itemElements) {
            // Select relative to the current <li>, not the whole document
            System.out.println(elem.select("div div a").text());
        }
    }
}

This prints:

Prodict 1

Prodict 2

You can think of each repeated <li> block as a little page of its own and run selectors against it, just as you would against the whole document.

Try this code.

Elements itemElements = doc.select("ul#searched-products");
itemElements = itemElements.select("li");
for (Element ele : itemElements) {
    String text = ele.text();
    System.out.println(text); // prints Prodict 1, then Prodict 2
}

// Or you can take the first <a> inside each <li>

for (Element ele : itemElements) {
    Element link = ele.select("a").first(); // null if this <li> has no <a>
    if (link != null) {
        System.out.println(link.text()); // also prints Prodict 1, then Prodict 2
    }
}

To exclude <li> or <a> tags outside the list, restrict the selector so that it matches only inside the list. The best way is to use the ID ( #searched-products ), and then select the <li> or <a> tags from the selected <ul> element rather than from the whole doc.

You can get your text with any of the following selectors (not a complete list):

  • #searched-products li a
  • #searched-products a
  • #searched-products .product_name a
  • #searched-products .product_name

Even the last one is okay, since you need only the text, and div.product_name contains only the <a> tag.

for(Element e: doc.select("#searched-products .product_name")) {
    String t = e.text(); // Prodict N
}
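As a rough illustration of the selectors listed above, the following sketch runs each of them against the markup from the question and collects the matched text. The class name SelectorComparison, the helper textsFromHtml and the extra unrelated list are made up for this example, and eachText() assumes a reasonably recent Jsoup (1.10.2 or later).

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import java.util.List;

public class SelectorComparison {

    // Markup from the question, plus one unrelated list (an assumption, added to show scoping)
    static final String SAMPLE_HTML =
            "<ul id=\"searched-products\">"
            + "<li><div class=\"gd-col navUnitContainer1 gu4\">"
            + "<div class=\"product_name\"><a>Prodict 1</a></div></div></li>"
            + "<li><div class=\"gd-col navUnitContainer1 gu4\">"
            + "<div class=\"product_name\"><a>Prodict 2</a></div></div></li>"
            + "</ul>"
            + "<ul><li><a>Unrelated link</a></li></ul>";

    // Hypothetical helper: parse the HTML and return the text of each matched element
    static List<String> textsFromHtml(String html, String selector) {
        Document doc = Jsoup.parse(html);
        return doc.select(selector).eachText();
    }

    public static void main(String[] args) {
        String[] selectors = {
            "#searched-products li a",
            "#searched-products a",
            "#searched-products .product_name a",
            "#searched-products .product_name",
            "li a" // unscoped, for contrast: also matches the unrelated list
        };
        for (String selector : selectors) {
            System.out.println(selector + " -> " + textsFromHtml(SAMPLE_HTML, selector));
        }
    }
}
```

The four scoped selectors should each yield the two product names, while the unscoped "li a" also picks up the link from the second list, which is why scoping by ID matters on a page with hundreds of <li> elements.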

By the way, your original approach of selecting the <li> tags inside ul#searched-products should have worked. If it returns nothing, the list is probably generated dynamically (by JavaScript) on that page, so it is not in the HTML that Jsoup receives. You can test this easily by printing out the HTML that Jsoup has ( doc.html() or doc.select("#searched-products").html() ).
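The same check can also be made on the raw response string before it is parsed, for example the string a Volley request hands back. The sketch below is a deliberately crude, standard-library-only version: the class RawHtmlCheck and the method listLooksServerRendered are hypothetical names, and it assumes the id attribute is written with double quotes and that the list contains no nested <ul>.

```java
import java.util.regex.Pattern;

public class RawHtmlCheck {

    // Matches an opening <li> tag (but not, say, <link>)
    private static final Pattern LI_TAG =
            Pattern.compile("<li[\\s>]", Pattern.CASE_INSENSITIVE);

    // Returns true if the raw HTML contains the list with the given id
    // and at least one <li> before that list's first closing </ul>.
    static boolean listLooksServerRendered(String rawHtml, String listId) {
        int start = rawHtml.indexOf("id=\"" + listId + "\"");
        if (start < 0) {
            return false; // the list itself is missing from the response
        }
        int end = rawHtml.indexOf("</ul>", start);
        String body = end < 0 ? rawHtml.substring(start) : rawHtml.substring(start, end);
        return LI_TAG.matcher(body).find();
    }

    public static void main(String[] args) {
        String rendered = "<ul id=\"searched-products\"><li><a>Prodict 1</a></li></ul>";
        String emptyShell = "<ul id=\"searched-products\"></ul><script src=\"app.js\"></script>";

        System.out.println(listLooksServerRendered(rendered, "searched-products"));   // true
        System.out.println(listLooksServerRendered(emptyShell, "searched-products")); // false
    }
}
```

A false result here means the items are not in the HTML the server sent, which points to the list being filled in later by a script.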

If that really is the case, Jsoup is not the right tool for the job, because it does not execute JavaScript. I suggest using Selenium, possibly with a headless browser ( HtmlUnit or PhantomJS ). These can return and even interact with dynamically created elements, so other parts of your crawl process may become simpler too.
