How can I convert these UTF-8 literals into character strings?

I have UTF-8 literals like this:

String literal = "\x6c\x69b/\x62\x2f\x6d\x69nd/m\x61x\x2e\x70h\x70";

I need to read them and convert them into plain text.

Is there an import in Java that can interpret these?

Thank you.

Java doesn't support UTF-8 literals per se; \xnn is not a valid escape sequence in a Java string literal. Java's language-level support for Unicode is limited to UTF-16 based Unicode escapes.

You can express your UTF-8 characters in a String literal with Unicode escapes as follows:

String literal = 
    "\u006c\u0069b/\u0062\u002f\u006d\u0069nd/m\u0061x\u002e\u0070h\u0070";

(Assuming no typing errors ...)

Or you could (in this case) simply replace the escapes with normal ASCII characters.

Note that the conversion from UTF-8 to UTF-16 is not normally that simple. (It is simple in this case because the \xnn values are all less than 0x80, so each one represents a single Unicode code point and a single UTF-16 code unit.)
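
If the \xnn sequences arrive as text at runtime (read from a file, say) rather than sitting in your source code, one way to handle them is to turn each escape back into a byte and then decode the whole byte sequence as UTF-8. Here is a minimal sketch of that idea. The class name HexEscapeDecoder and its methods are made up for illustration, not a standard API, and it assumes every escape is exactly two hex digits; anything that doesn't match is copied through unchanged.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the JDK.
class HexEscapeDecoder {

    // Turns each \xNN escape into the byte NN, copies everything else
    // as UTF-8 bytes, then decodes the whole byte sequence as UTF-8.
    static String decode(String escaped) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < escaped.length()) {
            if (isHexEscape(escaped, i)) {
                out.write(Integer.parseInt(escaped.substring(i + 2, i + 4), 16));
                i += 4;
            } else {
                // Copy one whole code point so surrogate pairs stay intact.
                int cp = escaped.codePointAt(i);
                byte[] raw = new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8);
                out.write(raw, 0, raw.length);
                i += Character.charCount(cp);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // True if the text at index i is a backslash, an 'x', and two hex digits.
    private static boolean isHexEscape(String s, int i) {
        return i + 3 < s.length()
                && s.charAt(i) == '\\'
                && s.charAt(i + 1) == 'x'
                && Character.digit(s.charAt(i + 2), 16) >= 0
                && Character.digit(s.charAt(i + 3), 16) >= 0;
    }

    public static void main(String[] args) {
        // Backslashes are doubled here so the String holds the literal escape text.
        String input = "\\x6c\\x69b/\\x62\\x2f\\x6d\\x69nd/m\\x61x\\x2e\\x70h\\x70";
        System.out.println(decode(input)); // prints lib/b/mind/max.php
    }
}
```

Because the bytes are collected first and decoded in one go, escapes at or above 0x80 that form multi-byte UTF-8 sequences should also come out as the right characters.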


Another approach is to represent the UTF-8 as an array of bytes, and convert that to a String; e.g.

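import java.nio.charset.StandardCharsets;

byte[] bytes = new byte[]{
    0x6c, 0x69, 'b', '/', 0x62, 0x2f, 0x6d, 0x69, 'n', 'd',
    '/', 'm', 0x61, 'x', 0x2e, 0x70, 'h', 0x70};
// StandardCharsets.UTF_8 avoids the checked UnsupportedEncodingException
// that the String(byte[], String) constructor declares.
String str = new String(bytes, StandardCharsets.UTF_8);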

(Again, assuming no typing errors.)
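
The byte-array route also works once the bytes go to 0x80 and above, which is where the one-escape-per-character shortcut stops holding. A small sketch follows; the sample text and the class name MultiByteDemo are my own choices for illustration, not something from the question.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class.
class MultiByteDemo {

    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is the two-byte UTF-8 encoding of U+00E9 (e with acute accent).
        // Values above 0x7F do not fit in Java's signed byte, hence the casts.
        byte[] bytes = {'c', 'a', 'f', (byte) 0xC3, (byte) 0xA9};
        String str = decodeUtf8(bytes);
        System.out.println(str.length()); // prints 4: five bytes became four chars
    }
}
```

Here two bytes collapse into one char, so swapping each \xnn for a \u00nn escape would give the wrong string.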

If you have the characters in a file to be read, you can use InputStreamReader to convert from whatever charset the file is in to a sequence of char:

InputStream is = ...; // get the input stream however you want
InputStreamReader isr = new InputStreamReader(is, "charset-name");
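
For a fuller picture, here is a rough sketch of reading a whole stream into a String this way. It assumes the data really is UTF-8; the class and method names (ReaderDemo, readAll) are invented for the example, and a ByteArrayInputStream stands in for a real file so the snippet runs on its own.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class.
class ReaderDemo {

    // Reads the stream to the end, decoding its bytes as UTF-8.
    static String readAll(InputStream is) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(is, StandardCharsets.UTF_8))) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for a file: the UTF-8 bytes of a short path.
        byte[] data = "lib/b/mind/max.php".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(data)));
    }
}
```

Note that InputStreamReader only decodes raw bytes. If the file contains the escape text itself (a backslash, an x, and two hex digits), the reader hands those characters back as they are, and you would still need to interpret them yourself.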

