
How to convert from HTML to UTF-8 in Java

I have an ASCII String that contains HTML entities, like:

 &agrave;
 &uml;
 &ccedil;

I need to replace those entities in the String with the corresponding UTF-8 characters. Is there an easy way to do that in Java?

For example, something like:

 Clazz.method("a&agrave;", "UTF-8")

that returns "aà"?

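Take a look at org.apache.commons.lang.StringEscapeUtils.unescapeHtml(...). Apparently it understands all character entities defined in HTML 4. In Commons Lang 3 the equivalent method is org.apache.commons.lang3.StringEscapeUtils.unescapeHtml4(...).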
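If adding a library is not an option, the idea can be sketched with the plain JDK. This is only a minimal illustration, not how Commons Lang does it: the class name HtmlEntityDecoder and its small entity table are hypothetical, and only numeric entities plus a handful of named ones are handled. Note that a Java String has no encoding of its own, so no "UTF-8" argument is needed when unescaping; UTF-8 only comes into play when the String is turned into bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HtmlEntityDecoder {
    // Tiny illustrative table (an assumption for this sketch);
    // HTML 4 defines many more named entities than these.
    private static final Map<String, Integer> NAMED = Map.of(
        "agrave", 0xE0, "uml", 0xA8, "ccedil", 0xE7,
        "amp", 0x26, "lt", 0x3C, "gt", 0x3E, "quot", 0x22);

    // Matches &#123; (decimal), &#x7B; (hex) and &name; forms.
    private static final Pattern ENTITY =
        Pattern.compile("&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);");

    public static String unescape(String input) {
        Matcher m = ENTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());
            Integer cp = codePoint(m.group(1));
            if (cp != null) {
                out.appendCodePoint(cp);
            } else {
                out.append(m.group()); // unknown entity: keep it verbatim
            }
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    // Returns the code point for an entity body, or null if it is unknown or invalid.
    private static Integer codePoint(String body) {
        try {
            int cp;
            if (body.startsWith("#x") || body.startsWith("#X")) {
                cp = Integer.parseInt(body.substring(2), 16);
            } else if (body.startsWith("#")) {
                cp = Integer.parseInt(body.substring(1));
            } else {
                return NAMED.get(body);
            }
            return Character.isValidCodePoint(cp) ? cp : null;
        } catch (NumberFormatException e) {
            return null; // number too large to be a code point
        }
    }

    public static void main(String[] args) {
        String decoded = unescape("a&agrave; &#231; &#xE8;");
        // Encode to UTF-8 only when bytes are actually needed (file, socket, ...).
        byte[] utf8 = decoded.getBytes(StandardCharsets.UTF_8);
        System.out.println(decoded + " (" + utf8.length + " UTF-8 bytes)");
    }
}
```

For real input, a maintained library is the safer choice, since it covers the full entity table and edge cases this sketch ignores.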
