
Get encoded HTML content only from URL in Java

Is there a library in Java that can encode HTML, but only the text content and not the tags?

I have something like

<div>Tél</div>

and I only want

<div>T&eacute;l</div>

instead of

&lt;div&gt;T&eacute;l&lt;/div&gt;

I need this library to encode an entire HTML document. I have tried the JSoup library, but it has bugs when handling some objects.

Thanks

It's never a good idea to parse HTML using regex; that's a recipe for disaster.

So first look at this Q&A on HTML parsing in Java: Java HTML Parsing

Once you can parse the HTML and extract its inner text, you can encode that text in one of these ways: Is there a JDK class to do HTML encoding (but not URL encoding)?
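As a rough illustration of encoding only the content, here is a minimal sketch that uses just the JDK. It is a simpler technique than the parse-then-encode approach described above: it assumes the input is already well-formed HTML, so `<`, `>` and `&` are left alone and only non-ASCII characters are replaced. It emits numeric character references (`&#233;`) rather than named ones (`&eacute;`); browsers render both the same way. The class name `ContentEncoder` and the method `encodeNonAscii` are hypothetical names chosen for this example.

```java
// Hypothetical helper, not part of any library.
class ContentEncoder {

    // Copies ASCII characters (including tag markup) unchanged and replaces
    // every non-ASCII code point with a decimal numeric character reference.
    // Assumption: the input is already valid HTML, so markup is not escaped.
    static String encodeNonAscii(String html) {
        StringBuilder out = new StringBuilder(html.length() + 16);
        for (int i = 0; i < html.length(); ) {
            int cp = html.codePointAt(i);
            if (cp < 128) {
                out.append((char) cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
            // Advance by 1 or 2 chars so surrogate pairs are handled as one code point.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00e9 is the letter e with an acute accent.
        System.out.println(encodeNonAscii("<div>T\u00e9l</div>"));
    }
}
```

Because character references are also valid inside attribute values, this sketch does not need to tell tags and text apart. If named entities are a hard requirement, a lookup table from code point to entity name would have to be added.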


 