
How to prevent Jsoup from creating the document structure while parsing

I am using the Jsoup API to parse an HTML fragment with the Jsoup.parse() method. However, parsing wraps the content in a full document structure.

For example:

<p><a href="some link">some link data</a> Some paragraph content</p>

Becomes

<html>
<head></head>
<body>
<p><a href="some link">some link data</a> Some paragraph content</p>
</body>
</html>

I don't want the document structure after parsing (I don't want the html, head, body tags). Is there any way to do it? Thanks in advance.

SOLUTION

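I used the body() and html() methods of the Document:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

Document storyBodyDoc = Jsoup.parse(body);
// body().html() returns only the markup inside <body>, without the html/head/body wrappers
String storyBodyHtml = storyBodyDoc.body().html();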

Thanks for the suggestion.
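
For reference, the same idea can be sketched as a small self-contained program. This is only an illustration, not the poster's exact code: the class name FragmentExample and the helper innerBodyHtml are hypothetical, and it uses Jsoup.parseBodyFragment() in place of Jsoup.parse(), since that method is intended for input that is body content rather than a whole page. Pretty-printing is turned off on the assumption that the markup should come back without re-indentation.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class FragmentExample {

    // Hypothetical helper: parse an HTML fragment and return it
    // without the surrounding html/head/body tags.
    static String innerBodyHtml(String fragment) {
        // Jsoup still builds a full Document internally; the fragment
        // ends up as the content of <body>.
        Document doc = Jsoup.parseBodyFragment(fragment);

        // Assumption: we do not want Jsoup to re-indent the output.
        doc.outputSettings().prettyPrint(false);

        // html() on the body element returns its inner HTML only.
        return doc.body().html();
    }

    public static void main(String[] args) {
        String fragment =
            "<p><a href=\"some link\">some link data</a> Some paragraph content</p>";
        System.out.println(innerBodyHtml(fragment));
    }
}
```

The Document still contains html, head, and body elements; they are simply left out when only the inner HTML of the body is serialized.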

You could select the children of the body-element:

Document doc = Jsoup.parse("<p><a href=\"some link\">some link data</a> Some paragraph content</p>");
Elements content = doc.body().children();
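
One difference from body().html() is worth keeping in mind: children() returns only child elements, so any text sitting directly inside the body (not wrapped in a tag) is left out. A rough sketch of this follows; the class name ChildrenExample, the helper childrenHtml, and the sample input are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class ChildrenExample {

    // Hypothetical helper: serialize only the element children of <body>.
    static String childrenHtml(String fragment) {
        Document doc = Jsoup.parse(fragment);
        // children() skips text nodes; it returns child elements only.
        Elements content = doc.body().children();
        // outerHtml() on Elements combines the outer HTML of each element.
        return content.outerHtml();
    }

    public static void main(String[] args) {
        // Sample input with loose text directly in the body (an assumption
        // chosen to show the difference between the two approaches).
        String fragment = "loose text <p>para</p>";
        System.out.println(childrenHtml(fragment));
        // For comparison, body().html() keeps the loose text as well.
        System.out.println(Jsoup.parse(fragment).body().html());
    }
}
```

If the fragment may contain bare text at the top level, body().html() is probably the safer choice; children() fits better when the individual elements are needed for further processing.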
