Render HTML Webpage to text in Java

I would like to get a human-readable text representation of a web page, one that still shows elements such as hyperlink locations and input fields.
Is there a library that does this? (I've checked the Jericho Renderer, but it does not show input fields.)
For example, this markup:

<div>
<form action="example.php">
Name:
<input type="text" name="name_field">
<input type="button" value="OK">
</form>
</div>

should be rendered as something like this:

Name: [________] [OK]

Try TagSoup and build the renderer yourself. It parses messy real-world HTML, so you can get a DOM model of the page, walk it, and emit the text in whatever form you like.
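A minimal sketch of the "walk the DOM and emit text" part, not a finished renderer. To stay self-contained it uses only the JDK's own XML parser, so it assumes the markup has already been cleaned into well-formed XML (for instance by TagSoup); the unclosed `<input>` tags from the question are written as `<input ... />` here. The class name `TextRenderer`, the placeholder `[________]`, and the `text <url>` form for links are illustrative choices, not part of any library.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class TextRenderer {

    // Parses well-formed markup and returns a one-line text rendering.
    // Assumption: the input is valid XML; raw HTML needs cleaning first.
    public static String render(String xhtml) throws Exception {
        Node root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)))
                .getDocumentElement();
        StringBuilder out = new StringBuilder();
        walk(root, out);
        // Collapse runs of whitespace, roughly as a browser would.
        return out.toString().replaceAll("\\s+", " ").trim();
    }

    private static void walk(Node node, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.append(node.getNodeValue());
            return;
        }
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // skip comments, processing instructions, etc.
        }
        Element el = (Element) node;
        String tag = el.getTagName();

        if ("input".equalsIgnoreCase(tag)) {
            String type = el.getAttribute("type");
            if ("button".equalsIgnoreCase(type) || "submit".equalsIgnoreCase(type)) {
                // Buttons show their label in brackets.
                out.append(" [").append(el.getAttribute("value")).append("] ");
            } else {
                // Any other input is drawn as an empty field.
                out.append(" [________] ");
            }
            return;
        }

        // Recurse into children in document order.
        for (Node child = el.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, out);
        }

        // Append the link target after the link text.
        if ("a".equalsIgnoreCase(tag) && el.hasAttribute("href")) {
            out.append(" <").append(el.getAttribute("href")).append("> ");
        }
    }

    public static void main(String[] args) throws Exception {
        String html = "<div><form action=\"example.php\">Name: "
                + "<input type=\"text\" name=\"name_field\"/>"
                + "<input type=\"button\" value=\"OK\"/>"
                + "</form></div>";
        System.out.println(render(html)); // prints: Name: [________] [OK]
    }
}
```

Other elements (`select`, `textarea`, checkboxes, block-level line breaks) can be handled by adding more branches to `walk` in the same way.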
