
How to get nested element using jSoup?

I am trying to access the nested element with class gwt-HTML from http://folkets-lexikon.csc.kth.se/folkets/#lookup&dricker&0 , which contains the following text:

Böjningar: drack, druckit, drick, dricka, dricker

Some quick, relevant information about the above site: it is an English-Swedish dictionary. All I need to do is slightly modify the URL each time and then grab the text that follows the word Böjningar. In this case I would get 'drack, druckit, drick, dricka, dricker'.

Here is what I have tried so far:

Document document = Jsoup.connect("http://folkets-lexikon.csc.kth.se/folkets/#lookup&dricker&0").get();
Elements elements = document.getElementsByClass("gwt-HTML");
if(!elements.isEmpty()){
    for(Element element: elements){
        System.out.println(element.data());
    }
} else {
    System.out.println("***********NO RESULTS !!!");
}

With the above code, I keep entering the else branch, even though when I inspect the elements of the site, I can see:

<div class="gwt-HTML">Böjningar: drack, druckit, drick, dricka, dricker</div>

How can I gain access to this element?

Here is a screenshot of the data: [image]

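Use select("div.gwt-HTML") instead of getElementsByClass("gwt-HTML"), and read the element's content with text() rather than data(). In jsoup, data() returns the contents of elements such as script and style, not the visible text of a div.

// Requires: org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.nodes.Element, org.jsoup.select.Elements
Document document = Jsoup.connect("http://folkets-lexikon.csc.kth.se/folkets/#lookup&dricker&0").get();
// CSS selector: every <div> whose class list contains gwt-HTML
Elements elements = document.select("div.gwt-HTML");
if(!elements.isEmpty()){
    for(Element element: elements){
        // text() returns the visible text of the element and its children
        System.out.println(element.text());
    }
} else {
    System.out.println("***********NO RESULTS !!!");
}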
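
If the else branch is still reached, one possible explanation worth checking is the URL itself. Everything after # is a fragment, which an HTTP client does not send to the server, and GWT pages generally build their content in the browser with JavaScript, which jsoup does not execute. The sketch below uses only java.net.URI to show which part of the URL a server would actually be asked for. The class name FragmentDemo and its two helper methods are illustrative and not part of jsoup.

```java
import java.net.URI;

public class FragmentDemo {
    // Returns the part after '#'. It stays in the client and is not part of the HTTP request.
    static String fragmentOf(String url) {
        return URI.create(url).getFragment();
    }

    // Rebuilds the URL without its fragment: roughly what the server is asked for.
    // (Assumes a URL with no query string, as in the question.)
    static String requestedResource(String url) {
        URI u = URI.create(url);
        return u.getScheme() + "://" + u.getAuthority() + u.getRawPath();
    }

    public static void main(String[] args) {
        String url = "http://folkets-lexikon.csc.kth.se/folkets/#lookup&dricker&0";
        System.out.println("fragment:  " + fragmentOf(url));
        System.out.println("requested: " + requestedResource(url));
    }
}
```

If this is the cause, the HTML that jsoup receives would not contain the gwt-HTML div no matter which selector is used. Two options in that case are to look in the browser's network tab for the request the page makes to fetch the lookup result, or to use a tool that executes JavaScript before the HTML is parsed.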


 