How do I convert a document made in Jsoup (the Java HTML parser) into a string?
I have a document that was made in jsoup that looks like this:
Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
How do I convert that doc into a string?
Have you tried:
Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
String htmlString = doc.toString();
As Document extends Element, it also has the method html(), which "Retrieves the element's inner HTML" according to the API. So that should work:
Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
String htmlString = doc.html();
Additional Info:
Each Document object holds a reference to an instance of the nested class Document.OutputSettings, which can be accessed via the outputSettings() method of Document. There you can enable or disable pretty-printing with the setter prettyPrint(true/false). See the API for Document and Document.OutputSettings for further information.
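To make the pretty-printing switch concrete, here is a minimal sketch. It parses a literal HTML string with Jsoup.parse instead of fetching a page, so it runs offline. The class name PrettyPrintDemo, the helper render, and the sample markup are made up for this example, and it assumes jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class PrettyPrintDemo {
    // Hypothetical helper: parse the markup, set pretty-printing, return the HTML string.
    static String render(String html, boolean pretty) {
        Document doc = Jsoup.parse(html);
        doc.outputSettings().prettyPrint(pretty);
        return doc.html();
    }

    public static void main(String[] args) {
        String sample = "<html><head><title>Demo</title></head><body><p>Hello</p></body></html>";
        // Indented, one element per line.
        System.out.println(render(sample, true));
        // Compact, without the added line breaks and indentation.
        System.out.println(render(sample, false));
    }
}
```

Turning pretty-printing off is mostly useful when the string is going to be stored or compared rather than read by a person.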
doc.toString() and doc.outerHtml().
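A small sketch of how these calls relate, again using a literal HTML string rather than a live page. The class name ToStringDemo, its helper methods, and the sample markup are assumptions for illustration. As far as I can tell, toString() simply delegates to outerHtml(); the difference between html() and outerHtml() only becomes visible on an ordinary element, where outerHtml() includes the element's own tag.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ToStringDemo {
    static final String SAMPLE = "<div id=\"box\"><p>Hello</p></div>";

    // Hypothetical helper: true if toString() and outerHtml() give the same string.
    static boolean sameOutput() {
        Document doc = Jsoup.parse(SAMPLE);
        return doc.toString().equals(doc.outerHtml());
    }

    // Hypothetical helper: the div including its own <div> tag.
    static String outer() {
        Element box = Jsoup.parse(SAMPLE).getElementById("box");
        return box.outerHtml();
    }

    // Hypothetical helper: only what is inside the div.
    static String inner() {
        Element box = Jsoup.parse(SAMPLE).getElementById("box");
        return box.html();
    }

    public static void main(String[] args) {
        System.out.println(sameOutput());
        System.out.println(outer());
        System.out.println(inner());
    }
}
```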
Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
Elements post = doc.select("div.post-content");
String dd = post.toString();
Document ddd = Jsoup.parse(dd);
After parsing the string into a Document, you can use Document functions on it:
Elements scriptTag = ddd.getElementsByTag("script");
System.out.println(scriptTag);
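For comparison, here is a sketch of the same lookup done two ways: through the string round trip shown above, and by calling select directly on the Elements result, which avoids re-parsing. This is an alternative I am suggesting, not something the answer above claims. The class name SelectDemo, the helper names, and the sample markup are invented for the example, which uses a literal string so it runs without network access.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class SelectDemo {
    // Made-up markup: one script inside the post, one outside it.
    static final String SAMPLE =
        "<div class=\"post-content\"><script>var a = 1;</script><p>Text</p></div>"
        + "<script>var b = 2;</script>";

    // Round trip: Elements -> String -> new Document -> getElementsByTag.
    static Elements viaReparse() {
        Document doc = Jsoup.parse(SAMPLE);
        Elements post = doc.select("div.post-content");
        Document reparsed = Jsoup.parse(post.toString());
        return reparsed.getElementsByTag("script");
    }

    // Direct: run a second selector on the Elements that were already matched.
    static Elements viaSelect() {
        Document doc = Jsoup.parse(SAMPLE);
        return doc.select("div.post-content").select("script");
    }

    public static void main(String[] args) {
        System.out.println(viaReparse());
        System.out.println(viaSelect());
    }
}
```

Both helpers should find only the script inside div.post-content. The round trip is still handy when the HTML string itself is what you need to keep.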