Is there a Standard Java SE HTML Parser? If so, why use non-standard ones?
I need to parse a simple HTML page with a simple form in it. The answers to similar questions on StackOverflow suggest using one of a large variety of non-standard Java libraries such as TagSoup, JSoup, HTMLParser and many others.
However, a web search revealed that some standard functionality exists in Java SE via this class: http://docs.oracle.com/javase/7/docs/api/javax/swing/text/html/parser/ParserDelegator.html
My sub-questions are:
Thank you.
The JDK has a built-in HTML parser, but it supports only an old version of HTML (HTML 3.2, according to the javax.swing.text.html documentation). It should support parsing of basic text formatting tags and forms.
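For a simple form, the built-in parser can be driven roughly as follows. This is only a minimal sketch, not a definitive implementation: the class name FormFieldLister, the method listFields and the sample markup are made up for illustration, while ParserDelegator, HTMLEditorKit.ParserCallback, HTML.Tag and HTML.Attribute are the standard Swing classes. Both callbacks are routed to one helper, on the assumption that a tag may arrive through either of them.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class: lists the <form> and <input> tags of a page
// using only the parser that ships with Java SE.
public class FormFieldLister {

    /** Returns one descriptive line per form and input tag found in the HTML. */
    public static List<String> listFields(String html) throws IOException {
        final List<String> found = new ArrayList<>();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                record(tag, attrs);
            }

            @Override
            public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                // Tags without content, such as <input>, are reported here.
                record(tag, attrs);
            }

            private void record(HTML.Tag tag, MutableAttributeSet attrs) {
                if (tag == HTML.Tag.FORM) {
                    found.add("form action=" + attrs.getAttribute(HTML.Attribute.ACTION)
                            + " method=" + attrs.getAttribute(HTML.Attribute.METHOD));
                } else if (tag == HTML.Tag.INPUT) {
                    found.add("input type=" + attrs.getAttribute(HTML.Attribute.TYPE)
                            + " name=" + attrs.getAttribute(HTML.Attribute.NAME));
                }
            }
        };

        // The third argument tells the parser to ignore any charset declared in the page.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Sample markup invented for this example.
        String html = "<html><body><form action=\"/login\" method=\"post\">"
                + "<input type=\"text\" name=\"user\">"
                + "<input type=\"password\" name=\"pass\">"
                + "</form></body></html>";
        for (String line : listFields(html)) {
            System.out.println(line);
        }
    }
}
```

The parser is callback-based and does not build a document tree, so anything beyond listing tags and attributes (selecting elements, coping with modern markup) has to be written by hand. That is where the third-party libraries become attractive.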
The reason to use other, third-party parsers is the need to support "real" HTML pages: DHTML, JavaScript, etc.
JSoup is one of the popular parsers that can do the job. For more information about other implementations, please take a look at the following discussion:
Pure Java HTML viewer/renderer for use in a Scrollable pane