String containing Chinese character to ASCII

This is my code. Both approaches give the same output.

String description = "test string with 音樂";
byte[] b = description.getBytes("US-ASCII");

//first way
char[] result = new char[b.length];       
for ( int i = 0; i < b.length; i++ ) {
    result[i] = (char)b[i];
}
System.out.println(new String(result)); //output - test string with ??

//second way
System.out.println(new String(b, "UTF-8")); //output - test string with ??

I am using Eclipse and changed the console output encoding to US-ASCII under Run Configurations.

Is it possible to get it as a US-ASCII encoded string?

Thanks in advance!

It's not possible to convert it to US-ASCII. However, if you just want a Unicode-escaped string, you can use the Apache Commons Lang utility:

import org.apache.commons.lang.StringEscapeUtils;

...
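// The backslashes are doubled so the method receives the literal escape text;
// with a single backslash the compiler would decode the characters itself.
StringEscapeUtils.unescapeJava("test string with \\u97F3\\u6A02");
// gives result: test string with 音樂
StringEscapeUtils.escapeJava("test string with 音樂");
// gives result: test string with \u97F3\u6A02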
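If pulling in Commons Lang is not an option, the same idea can be roughly sketched with the JDK alone. The class `AsciiEscapeSketch` and the method `escapeNonAscii` below are hypothetical names made up for illustration. Unlike `escapeJava`, this sketch does not touch quotes, backslashes, or control characters; it only rewrites chars above 0x7F.

```java
final class AsciiEscapeSketch {

    // Hypothetical helper: copies ASCII chars unchanged and rewrites every
    // other UTF-16 code unit as a backslash-u escape with four hex digits.
    static String escapeNonAscii(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The two Chinese characters are written as escapes in the source so
        // the example does not depend on the source file encoding.
        String description = "test string with \u97F3\u6A02";
        // The printed result contains only ASCII characters.
        System.out.println(escapeNonAscii(description));
    }
}
```

The result is a plain ASCII string, so it survives a US-ASCII console, but it is an escaped representation of the text rather than a real conversion of the characters.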

It is not possible to convert Chinese characters to US-ASCII because they are not contained in this character set.

US-ASCII defines only 128 characters, and 33 of them are non-printing control characters.
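To see this from code, here is a small sketch, assuming you only want to detect the problem before encoding. The class and method names are invented for the example. `CharsetEncoder.canEncode` reports whether a string fits in the charset, and `String.getBytes(Charset)` replaces each unmappable character with the charset's replacement byte, which for US-ASCII is `?`. That replacement is where the `??` in the question's output comes from.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class AsciiCheckSketch {

    // Hypothetical helper: true only if every character is representable in US-ASCII.
    static boolean isPureAscii(String s) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();
        return encoder.canEncode(s);
    }

    // Hypothetical helper: encodes to US-ASCII bytes and decodes them again.
    // Unmappable characters are already lost at the getBytes step.
    static String roundTripThroughAscii(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Chinese characters written as escapes to stay independent of the file encoding.
        String description = "test string with \u97F3\u6A02";
        System.out.println(isPureAscii(description));
        System.out.println(roundTripThroughAscii(description));
    }
}
```

Because the information is gone once the bytes are produced, decoding them afterwards with UTF-8 or any other charset cannot bring the original characters back.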
