HTML special character parsing

Question

I'm looking for a Java class to parse all HTML special characters. I guess it's a common problem, but I cannot find a quick solution right now.

What I want to get is:

input: th&egrave; --> output: thè
input: &#187;
input: &lraquo;
...

Do you know anything useful for me?

Answer 1

Have you tried googling it? The first result for "java HTML markup entity parser" points to an HTML text extractor.

It seems to be what you need.

Also, you may want to look at the renderers of javax.swing.JLabel (and the other Swing text components).

Answer 2

Try the StringEscapeUtils utility class. Check the docs for the StringEscapeUtils.unescapeHtml() method.

Docs here:

http://commons.apache.org/lang/api-release/org/apache/commons/lang/StringEscapeUtils.html

Download here:

http://commons.apache.org/lang/
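With Commons Lang on the classpath, the call suggested above is a one-liner: StringEscapeUtils.unescapeHtml(input). To illustrate what that kind of unescaping involves, here is a rough JDK-only sketch. The class name EntityDecoder, its NAMED table, and the helper decode are all hypothetical names made up for this example, and the table covers only a handful of entities, so treat it as an illustration of the idea rather than a replacement for the library.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Commons Lang.
public class EntityDecoder {

    // Deliberately tiny table (an assumption of this sketch).
    // HTML defines many more named entities than these.
    private static final Map<String, String> NAMED = new HashMap<>();
    static {
        NAMED.put("amp", "&");
        NAMED.put("lt", "<");
        NAMED.put("gt", ">");
        NAMED.put("quot", "\"");
        NAMED.put("egrave", "\u00E8");
        NAMED.put("laquo", "\u00AB");
        NAMED.put("raquo", "\u00BB");
    }

    // Matches &name;, decimal &#187; and hexadecimal &#xBB; references.
    private static final Pattern ENTITY =
            Pattern.compile("&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);");

    public static String unescape(String input) {
        Matcher m = ENTITY.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = m.group(1);
            String replacement;
            if (body.startsWith("#x") || body.startsWith("#X")) {
                replacement = decode(body.substring(2), 16, m.group());
            } else if (body.startsWith("#")) {
                replacement = decode(body.substring(1), 10, m.group());
            } else {
                // Unknown names are left exactly as they appeared in the input.
                replacement = NAMED.getOrDefault(body, m.group());
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Converts a numeric reference to its character, or returns the
    // original text when the number is not a valid code point.
    private static String decode(String digits, int radix, String fallback) {
        try {
            int codePoint = Integer.parseInt(digits, radix);
            return Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(unescape("th&egrave;")); // "th" followed by U+00E8
        System.out.println(unescape("&#187;"));     // U+00BB
        System.out.println(unescape("&lraquo;"));   // not in the table, printed unchanged
    }
}
```

The sketch passes unknown names such as &lraquo; through untouched instead of guessing. For real input, a library with a complete entity table is the safer choice.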

2 answers

Answer 1: 0, 2010-11-02 13:36:04

Answer 2: 0, accepted, 2010-11-02 14:07:05
