Want to replace special characters with equivalent UTF-8 symbols
As part of my application I have written a custom method to extract data from the DB and return it as a string. My string has special characters such as the pound sign, which when extracted looks like this:
"MyMobile Blue &#163;54.99 [12 month term]"
I want the &#163; to be replaced with the actual pound symbol. Below is my method:
    public String getOfferName(String offerId) {
        log(Level.DEBUG, "Entered getSupOfferName");
        OfferClient client = (OfferClient) ApplicationContext.get(OfferClient.class);
        OfferObject offerElement = getOfferElement(client, offerId);
        if (offerElement == null) {
            return "";
        } else {
            return offerElement.getDisplayValue();
        }
    }
Can someone help with this?
The document contains XML/HTML entities. You can use the StringEscapeUtils.unescapeXml() method from commons-lang to parse these back to their Unicode equivalents.
If this is HTML rather than XML, use the corresponding HTML method instead (unescapeHtml() in commons-lang 2.x, unescapeHtml4() in commons-lang3), as there are differences between the two sets of entities.
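For readers without commons-lang on the classpath, the same effect can be illustrated with the XML parser that ships with the JDK. This is a different technique from unescapeXml(), offered only as a dependency-free sketch: the class name, the method name, and the wrapper element <v> are hypothetical, and the input is assumed to contain no raw '<' or bare '&'.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original post or of commons-lang.
class XmlUnescapeSketch {

    // Wraps the escaped text in a throwaway element and lets the XML parser
    // resolve numeric character references and the predefined XML entities.
    static String unescapeViaXmlParser(String escaped) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new InputSource(new StringReader("<v>" + escaped + "</v>")));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Input that is not well-formed XML text, for example an HTML-only
            // named entity such as &pound;, ends up here.
            throw new IllegalArgumentException("Not valid XML text: " + escaped, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
                unescapeViaXmlParser("MyMobile Blue &#163;54.99 [12 month term]"));
    }
}
```

A named HTML entity such as &pound; is rejected by this sketch because XML predefines only &lt;, &gt;, &amp;, &quot; and &apos;, which is one concrete instance of the difference between the two entity sets.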
I voted for the StringEscapeUtils.unescapeXml() solution. Anyway, here is a custom solution:
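    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String s = "MyMobile Blue &#163;54.99 [12 month term]";
    // Match decimal numeric character references such as &#163;
    Pattern p = Pattern.compile("&#(\\d+?);");
    Matcher m = p.matcher(s);
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
        int c = Integer.parseInt(m.group(1));
        // quoteReplacement keeps '$' and '\' from being read as group references;
        // Character.toChars also covers code points above U+FFFF
        m.appendReplacement(sb, Matcher.quoteReplacement(new String(Character.toChars(c))));
    }
    m.appendTail(sb);
    System.out.println(sb);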
Output:
MyMobile Blue £54.99 [12 month term]
Note that it does not accept hexadecimal entity references such as &#xA3;.
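If hexadecimal references also need to be handled, the same regex approach can be extended. The following is only a sketch under assumptions: the class and method names are hypothetical, and leaving malformed or out-of-range references untouched is a design choice made here, not something taken from the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper extending the custom solution to hex references.
class NumericEntityDecoder {

    // Group 1: hex digits of &#xA3; / &#XA3;   Group 2: decimal digits of &#163;
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|(\\d+));");

    static String decode(String input) {
        Matcher m = NUMERIC_REF.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String replacement;
            try {
                int codePoint = m.group(1) != null
                        ? Integer.parseInt(m.group(1), 16)
                        : Integer.parseInt(m.group(2));
                // toChars handles supplementary code points (above U+FFFF) too
                replacement = new String(Character.toChars(codePoint));
            } catch (IllegalArgumentException e) {
                // Overflowing number or invalid code point: keep the reference as-is
                replacement = m.group();
            }
            // quoteReplacement prevents '$' and '\' from being treated specially
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("MyMobile Blue &#163;54.99 [12 month term]"));
        System.out.println(decode("MyMobile Blue &#xA3;54.99 [12 month term]"));
    }
}
```

Both calls in main decode to the same string containing the pound sign. Named entities such as &amp; are deliberately out of scope here; for those, the commons-lang methods remain the simpler route.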