How to convert a document generated in Jsoup (Java HTML parser) to a string

Question

I have a document that was created with jsoup, like this:

Document doc = Jsoup.connect("http://en.wikipedia.org/").get();

How do I convert that doc into a string?

Answer 1

Have you tried:

Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
String htmlString = doc.toString();

Because Document extends Element, it also has the method html(), which "Retrieves the element's inner HTML" according to the API. So this should work as well:

Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
String htmlString = doc.html();

Additional info:

Each Document object holds a reference to an instance of the nested class Document.OutputSettings, which can be accessed via the Document method outputSettings(). There you can enable or disable pretty-printing with the setter prettyPrint(true/false). See the API documentation for Document and Document.OutputSettings for further information.
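To make the effect of the pretty-print setting concrete, here is a minimal sketch. It assumes jsoup is on the classpath and parses a small in-memory HTML string rather than fetching a page; the class name PrettyPrintDemo and the helper render are made up for this illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class PrettyPrintDemo {

    // Hypothetical helper: parses the HTML and serializes it again,
    // with pretty-printing switched on or off.
    static String render(String html, boolean pretty) {
        Document doc = Jsoup.parse(html);
        doc.outputSettings().prettyPrint(pretty);
        return doc.html();
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Demo</title></head>"
                + "<body><p>Hello</p></body></html>";

        // Pretty-printed: jsoup adds line breaks and indentation.
        System.out.println(render(html, true));

        // Not pretty-printed: no extra whitespace is added.
        System.out.println(render(html, false));
    }
}
```

Turning pretty-printing off is usually what you want when the string is going to be compared, hashed, or stored, since the output then carries no whitespace that jsoup added itself.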

Answer 2

Use doc.toString() or doc.outerHtml().
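The following sketch compares the serialization methods mentioned so far. As far as I know, toString() simply delegates to outerHtml() in jsoup, and for a whole Document html() yields the same string as well; the difference between html() and outerHtml() only shows up on an individual Element. The class name SerializeDemo, the helper sample, and the sample markup are assumptions made for this example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SerializeDemo {

    // Hypothetical helper: builds a small document from an in-memory string
    // and disables pretty-printing so the output contains no added whitespace.
    static Document sample() {
        Document doc = Jsoup.parse("<p>Hello <b>world</b></p>");
        doc.outputSettings().prettyPrint(false);
        return doc;
    }

    public static void main(String[] args) {
        Document doc = sample();
        Element p = doc.select("p").first();

        // Whole document: both calls are expected to return the same string.
        System.out.println(doc.toString().equals(doc.outerHtml()));

        // Single element: html() is the content only,
        // outerHtml() also includes the <p> tag itself.
        System.out.println(p.html());
        System.out.println(p.outerHtml());
    }
}
```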

Answer 3

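 Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
 // Select the elements of interest and serialize them to an HTML string
 Elements post = doc.select("div.post-content");
 String dd = post.toString();
 // Parse that string back into a new Document
 Document ddd = Jsoup.parse(dd);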

After parsing the string back into a Document, you can call Document methods on it:

 Elements scriptTag = ddd.getElementsByTag("script");
 System.out.println(scriptTag);
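The round trip above can be tried without a network connection. The sketch below assumes jsoup is on the classpath; the class name FragmentDemo, the helper extract, and the sample markup are invented for this example and are not part of the original answer.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class FragmentDemo {

    // Hypothetical helper: selects part of a page, serializes the selection
    // to a string, and parses that string into a new, smaller Document.
    static Document extract(String html, String cssQuery) {
        Elements selected = Jsoup.parse(html).select(cssQuery);
        return Jsoup.parse(selected.outerHtml());
    }

    public static void main(String[] args) {
        String html = "<div class=\"post-content\"><p>Text</p>"
                + "<script>var x = 1;</script></div>"
                + "<script>var y = 2;</script>";

        Document fragment = extract(html, "div.post-content");

        // Only the script inside div.post-content is part of the new document.
        Elements scripts = fragment.getElementsByTag("script");
        System.out.println(scripts.size());
        System.out.println(scripts.first().data());
    }
}
```

If the goal is only to query inside the selected elements, calling select or getElementsByTag directly on the selected Element avoids the extra serialize-and-parse step.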


Answer 1: score 36, accepted, 2011-07-28 20:17:59
Answer 2: score 8, 2011-07-28 20:20:25
Answer 3: score 0, 2014-09-03 03:10:47
