
How to convert the special character 'β' to its Unicode code point

I want to convert 'β' to its Unicode code point, U+03B2, in code. When I try to convert it, I get 63 as its integer value, which is the value of the '?' character rather than the correct value. Is there any way to get the correct value of 'β', i.e. decimal 946, hex 03B2?

I have tried:

   int code = 'β';
   byte[] b = { (byte)code };
   String s = new String(b, "UTF-8");

Here is the value in various forms:

int code = 'β';
System.out.println(code);                                       // 946 as an int
System.out.println(Integer.toString(code));                     // 946 as a String
System.out.println(Integer.toHexString(code));                  // 3b2
System.out.println(String.format("%04x", code));                // 03b2
System.out.println(String.format("%04x", code).toUpperCase());  // 03B2

(Edit: Having seen the other answers, I now know that you can use the format string "%04X" to get the answer in upper case directly.)

If UTF-8 is not your platform default character encoding, you'll need to make sure that the source file is saved in UTF-8 encoding, and then specify the -encoding UTF-8 option when compiling (or use another character encoding that supports β).
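If controlling the file encoding is impractical, one option is to write the character as a Unicode escape, which the compiler reads the same way whatever encoding the source file is saved in. A minimal sketch; the class name `BetaEscape` and the helper `describe` are made up for illustration:

```java
public class BetaEscape {
    // Greek small letter beta written as a Unicode escape,
    // so this source file contains only ASCII characters.
    public static final char BETA = '\u03B2';

    // Render a char as "U+XXXX (decimal N)".
    public static String describe(char c) {
        // %04X pads to four upper-case hex digits.
        return String.format("U+%04X", (int) c) + " (decimal " + (int) c + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe(BETA)); // U+03B2 (decimal 946)
    }
}
```

The trade-off is readability: the escape is harder to recognise than the literal β, so a short comment naming the character helps.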

Your code is wrong because you are taking a char, which is 16 bits, and chopping it in half, keeping only the lower 8 bits. Narrowing casts can destroy data; they are required to be written explicitly to make you think about what you are doing.
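To see the data loss directly, here is a small sketch of what the narrowing cast does to 946. The class name `NarrowingDemo` and the helper `lowByte` are hypothetical, and β is written as an escape so the example does not depend on the source encoding:

```java
public class NarrowingDemo {
    // Narrowing an int to a byte keeps only the low 8 bits.
    public static byte lowByte(int code) {
        return (byte) code;
    }

    public static void main(String[] args) {
        int code = '\u03B2';            // Greek small letter beta: 0x000003B2, decimal 946
        byte low = lowByte(code);       // only 0xB2 survives

        System.out.println(low);                              // -78, because byte is signed
        System.out.println(Integer.toHexString(low & 0xFF));  // b2
        System.out.println(code == (low & 0xFF));             // false: the 0x03 part is gone
    }
}
```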

Your code is like this:

int code = 0x000003B2;
byte[] b = { (byte) 0xB2 };

The byte sequence 0xB2 isn't valid UTF-8, so it is decoded as the replacement character (U+FFFD) in the string s. If your output device isn't configured to display that character, it will be swapped for a different replacement character, ?, on output.
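The following sketch contrasts the lone byte 0xB2 with the two-byte sequence that UTF-8 actually uses for β. The class `Utf8Demo` and its helpers are illustrative names, not part of the original code:

```java
import java.nio.charset.StandardCharsets;

public class Utf8Demo {
    // Decode raw bytes as UTF-8; malformed input is replaced with U+FFFD.
    public static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Encode a string as UTF-8 and show each byte as two hex digits.
    public static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X ", b & 0xFF)); // mask to treat the byte as unsigned
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // A lone continuation byte is not valid UTF-8.
        String broken = decode(new byte[] { (byte) 0xB2 });
        System.out.printf("U+%04X%n", (int) broken.charAt(0)); // U+FFFD

        // Beta needs two bytes in UTF-8.
        System.out.println(utf8Hex("\u03B2"));                  // CE B2
        String ok = decode(new byte[] { (byte) 0xCE, (byte) 0xB2 });
        System.out.printf("U+%04X%n", ok.codePointAt(0));       // U+03B2
    }
}
```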

If you get the encoding correct in your editor and compiler, this should work:

int code = 'β';
System.out.printf("U+%04X%n", code);
String s = "β";
int i = s.codePointAt(0);
System.out.printf("U+%04X", i);
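If you need this for a whole string rather than a single character, one possible extension is to walk the string by code point, which also copes with characters outside the 16-bit range. This is only a sketch; `CodePoints` and `describe` are hypothetical names, and the characters are written as escapes so the source encoding does not matter:

```java
import java.util.stream.Collectors;

public class CodePoints {
    // Format every code point of s as U+XXXX, separated by spaces.
    public static String describe(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Beta followed by U+1F600, which Java stores as a surrogate pair.
        System.out.println(describe("\u03B2\uD83D\uDE00")); // U+03B2 U+1F600
    }
}
```

Iterating with charAt would instead report the two surrogate halves separately.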
