
What is a lightweight library for encoding HTML entities in Java?

What I require is a Java method or lightweight library which will encode special characters into HTML entities. So & becomes &amp;, " becomes &quot;, £ becomes &pound;, etc.

I say "lightweight" because all my searching has turned up so far is the Apache Commons Lang StringEscapeUtils class, which does the job perfectly but increases my program size from 50 KB to 350 KB.

The Apache Commons Lang library is perfect, apart from the size. So if there were a way of reducing the size (or extracting the method it uses for encoding), that would be great. Otherwise, if someone has another method or library which does the same thing, it would be greatly appreciated.

Are you deploying on a phone? Otherwise, 300 KB is nothing.

Anyway, the special chars to encode are not many: <, >, &, ", '. All the other characters don't need escaping if you use an encoding able to handle all the characters, like UTF-8. So building such a method yourself should be very easy.
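As a rough illustration of how small such a method can be, here is a minimal sketch. The class and method names (HtmlEscaper, escapeHtml) are made up for this example and are not taken from any library. It assumes the page is served in an encoding such as UTF-8, so characters like £ pass through unchanged.

```java
/** Hypothetical helper, named for this example only. */
final class HtmlEscaper {
    private HtmlEscaper() {}

    /** Replaces the five HTML-significant characters with entities. */
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                case '"':  out.append("&quot;"); break;
                // &apos; is not defined in HTML 4, so use the numeric form.
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Calling HtmlEscaper.escapeHtml on the text Tom & Jerry's would give Tom &amp; Jerry&#39;s. Nothing else is touched, which is why this only works if the output encoding can represent every character in the input.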

Try getting the source code of that library (StringEscapeUtils) and use only the parts you need, rather than all of it.

If you are content with named entities for <, >, &, " and ', and numeric entities (like &#12345;) for chars > 127, then Java already knows how to convert them. JTextPane handles HTML this way, as it is encoding-unaware.

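import javax.swing.JTextPane;

// Minimum overhead (untested; `html` is a String holding the input markup):
JTextPane tp = new JTextPane();
tp.setContentType("text/html");
tp.setText(html); // parses the markup into the pane's document
String htmlWithEntities = tp.getText(); // should re-serialize it with entities; does this work?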

Better would be using HTMLEditorKit and creating an HTMLDocument.
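A possible sketch of that HTMLEditorKit route is below. The class and method names (SwingEntityEncoder, encode) are invented for this example. It assumes the input is an HTML fragment, and that the kit's writer emits numeric entities for characters above 127, as the answer describes. Note that the result is a complete HTML document (with html, head and body elements), not just the fragment that went in.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.swing.text.BadLocationException;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

/** Hypothetical helper, named for this example only. */
final class SwingEntityEncoder {
    private SwingEntityEncoder() {}

    /** Parses the markup into an HTMLDocument and writes it back out. */
    static String encode(String html) {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        try {
            // Read the markup into the document model.
            kit.read(new StringReader(html), doc, 0);
            // Serialize the whole document; the kit's writer handles entities.
            StringWriter out = new StringWriter();
            kit.write(out, doc, 0, doc.getLength());
            return out.toString();
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException("Could not re-serialize HTML", e);
        }
    }
}
```

This avoids creating a visual component, but it still builds a full document model, so it is heavier than a hand-written loop.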

If you would like to avoid having a document object model, you could easily do it yourself. See JB Nizet's answer.
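For the do-it-yourself route with the same output described above (entities for the five special characters, numeric entities for everything above 127), a sketch might look like the following. The names (NumericEntityEscaper, escape) are hypothetical. It iterates over code points rather than chars, so characters outside the Basic Multilingual Plane become a single entity.

```java
/** Hypothetical helper, named for this example only. */
final class NumericEntityEscaper {
    private NumericEntityEscaper() {}

    /** Escapes markup characters and emits &#NNN; for code points above 127. */
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp); // advance by 1 or 2 chars
            switch (cp) {
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:
                    if (cp > 127) {
                        // Decimal numeric character reference.
                        out.append("&#").append(cp).append(';');
                    } else {
                        out.appendCodePoint(cp);
                    }
            }
        }
        return out.toString();
    }
}
```

With this variant the output is pure ASCII, so it no longer depends on the encoding the page is served in.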
