
How can I fetch only the outer div's text with jsoup?

I have the following HTML code:

<div class="description">
    <div class='daterange'>
        Hello 
     <span itemprop='startDate'>
        June 3, 2011
     </span>
    </div>
    This is some description <i>that</i> I want to fetch
 </div><br/>

and I want to extract only the part:

This is some description <i>that</i> I want to fetch

How can I do it with jsoup?

I tried using String description = doc.select("div.description").text(), but that returns all of the text inside the div, including the text of the nested daterange div.

You need to build a String that holds the text of the HTML page. The following code does that: doc.body().text() returns the text of the body with all HTML tags stripped.

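`import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Fetches the page at the given URL and returns its body text with all tags stripped.
public String getWords(String url) {
    String text = "";
    try {
        Document doc = Jsoup.connect(url).get();
        text = doc.body().text();
    } catch (IOException ioe) {
        // On a network or I/O failure, log it and fall through to return "".
        ioe.printStackTrace();
    }
    return text;
}
`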

Try this:

String description = doc.select("div").remove().first().html();
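Note that the one-liner above removes every div from the parsed document and relies on div.description being the first div on the page. A less destructive variant might look like the sketch below: it works on a copy of the outer div, drops only its direct div children, and reads what is left. The class name OuterDivText and the method name outerDescriptionHtml are made up for illustration, and the sketch assumes a jsoup version that provides selectFirst (1.11 or later).

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

final class OuterDivText {

    // Hypothetical helper: returns the inner HTML of the first div.description,
    // with its direct <div> children removed. The parsed document is left untouched.
    static String outerDescriptionHtml(String html) {
        Document doc = Jsoup.parse(html);
        Element description = doc.selectFirst("div.description");
        if (description == null) {
            return ""; // no matching element
        }
        // Work on a copy so the original DOM is not modified.
        Element copy = description.clone();
        // children() returns a fresh list, so removing while iterating is safe.
        for (Element child : copy.children()) {
            if (child.tagName().equals("div")) {
                child.remove();
            }
        }
        return copy.html().trim();
    }

    public static void main(String[] args) {
        String html = "<div class=\"description\">"
                + "<div class='daterange'>Hello <span itemprop='startDate'>June 3, 2011</span></div>"
                + "This is some description <i>that</i> I want to fetch"
                + "</div><br/>";
        System.out.println(outerDescriptionHtml(html));
    }
}
```

If the plain text is wanted rather than the markup, calling text() on the copy instead of html() should work the same way. Element.ownText() is another option, but it only collects the element's direct text nodes, so the word inside the <i> tag would be left out.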


 