
Convert from encoded Unicode String into Java String

I have a string in JSON data that looks like this:

#0023Sat Apr 30 10:46:11 UTC 2016#000a[Interoperability]Interoperability#005c Index=Unknown (R03)#000a[Exif]Shutter#005c Speed#005c Value=1/1999 sec#000a[Exif]Bits#005c Per#005c Sample=8 8 8 bits/component/pixel#000a[Exif]Exposure#005c Bias#005c Value=0 EV#000a[Exif]Sub-Sec#005c Time#005c Original=00#000a

All of those #XXXX sequences are Unicode escapes.

How do I convert this into a Java String?

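import java.util.regex.Matcher;
import java.util.regex.Pattern;

static String decode(String s) {
    // Match '#' followed by exactly four hex digits.
    Pattern p = Pattern.compile("#([0-9A-Fa-f]{4})");
    Matcher m = p.matcher(s);
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
        int c = Integer.parseInt(m.group(1), 16);
        // quoteReplacement keeps '\' (#005c) and '$' (#0024) literal;
        // an unquoted "\" makes appendReplacement throw.
        m.appendReplacement(sb, Matcher.quoteReplacement(String.valueOf((char) c)));
    }
    m.appendTail(sb);
    return sb.toString();
}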

This assumes that each #XXXX encodes a single UTF-16 code unit. Unicode code points can exceed the 16-bit range of #XXXX, so a character outside the Basic Multilingual Plane decodes correctly only if the encoder wrote it as two consecutive #XXXX escapes (a surrogate pair).
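As a rough sketch of the same idea, the loop below appends the decoded characters directly instead of going through appendReplacement, which sidesteps the special meaning of \ and $ in replacement strings (the sample data contains #005c, a backslash). The class name HashEscapeDecoder and the method name decode are illustrative only, and the surrogate-pair input assumes the encoder writes characters above U+FFFF as two escapes, which the question does not confirm.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HashEscapeDecoder {
    // '#' followed by exactly four hex digits, as in the sample data.
    private static final Pattern ESCAPE = Pattern.compile("#([0-9A-Fa-f]{4})");

    // Decodes each #XXXX escape into one UTF-16 code unit (hypothetical helper).
    public static String decode(String s) {
        Matcher m = ESCAPE.matcher(s);
        StringBuilder sb = new StringBuilder();
        int last = 0;
        while (m.find()) {
            sb.append(s, last, m.start());                       // copy literal text before the escape
            sb.append((char) Integer.parseInt(m.group(1), 16));  // append the decoded code unit
            last = m.end();
        }
        sb.append(s, last, s.length());                          // copy whatever follows the last escape
        return sb.toString();
    }

    public static void main(String[] args) {
        // Shortened excerpt of the string from the question.
        String decoded = decode("#0023Sat Apr 30#000a[Exif]Shutter#005c Speed=1/1999 sec#000a");
        System.out.println(decoded);

        // Assumption: U+1F600 arrives as the surrogate pair D83D DE00.
        String emoji = decode("#d83d#de00");
        System.out.println(Integer.toHexString(emoji.codePointAt(0))); // prints "1f600"
    }
}
```

Because the two surrogate halves end up adjacent in the resulting String, Java treats them as one code point without any extra handling. Text that merely resembles an escape, such as a literal # followed by four hex digits, would also be decoded, so this only works if the encoder escapes # itself (the sample's leading #0023 suggests it does).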


 