
Int to char conversion, wrong number

I have the following code:

int a = -12;
char b = (char) a;
System.out.println(b);
System.out.println(b + 0);

It first prints an empty-looking character and then the number 65524. If I change a to, say, -16, the displayed number becomes 65520. If a is -2, the number is 65534.

If the number is positive and small enough, it prints a character from the Unicode table, and the second line prints the character's number, which is the same as a if everything is OK; if a is too big, it prints one of those strange numbers from above instead.

For instance, for a = 8451 it prints the ℃ (Degree Celsius) character and a itself (8451), but for a = 84510 it prints some Chinese character and a number different from a (18974). If a is even bigger, a = 845100, it prints an empty symbol and 58668.

The question is, where do those numbers come from?

I've tried to find the answer, but haven't had any luck so far.

EDIT: Since int to char is a narrowing conversion, please consider the same question with byte a. Too-large numbers are obviously impossible now, but I wonder what happens with negatives: converting byte a = -x to char gives the same weird numbers as with int a.
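For reference, a minimal repro of the byte case:

byte a = -12;
char b = (char) a;
System.out.println(b + 0);    // prints 65524, same as with int a = -12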

int is a signed type that is 4 bytes (32 bits) wide. The two's complement representation of -12 is

11111111 11111111 11111111 11110011 .

char is an unsigned type that is 2 bytes (16 bits) wide.

int a=-12;
char b=(char)a;

Casting to char keeps only the low 16 bits of that pattern, so b is

11111111 11110011

which, read as an unsigned value, is 65524.
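A small sketch to check this: the cast keeps only the low 16 bits, which you can reproduce yourself with a bit mask (variable names are mine):

int a = -12;
char b = (char) a;
int low16 = a & 0xFFFF;       // keep only the low 16 bits: 0xFFF4
System.out.println(b + 0);    // 65524
System.out.println(low16);    // 65524, identical to the cast result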

I think it's because char in Java is an unsigned two-byte integer. A byte is 8 bits, so two bytes are 16 bits, which gives 2^16 = 65536 possible values, and negative inputs wrap around (two's complement arithmetic).

Because the number is unsigned, all 16 bits are used to represent the integer, so when you give a negative number it wraps around and you get 65536 + a (see the sketch after these examples):

When int a = -16; you get 65536 - 16 = 65520 (in binary: 1111 1111 1111 0000)

When int a = -2; you get 65536 - 2 = 65534 (in binary: 1111 1111 1111 1110)

When int a = 84510; you exceed the 65536 values a char can hold, so it wraps around and you are left with 84510 - 65536 = 18974.
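All of these follow one rule: the result is the original value taken modulo 65536. A minimal sketch to verify it, using Math.floorMod from the standard library (which, unlike %, never returns a negative result):

int[] values = { -16, -2, 84510, 845100 };
for (int v : values) {
    char c = (char) v;
    // the cast result always equals the value modulo 65536
    System.out.println(v + " -> " + (c + 0) + " == " + Math.floorMod(v, 65536));
}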

The character you see is the one at that code point in the Unicode table; I guess exactly what is displayed depends on the character set or code page your environment uses.

When you cast, you should pay attention to the range of values of the data types involved; in this case, the difference between int and char.
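As a side note (my own addition, not part of the original answer): if you would rather have the cast fail loudly than wrap silently, a sketch of a defensive helper might look like this:

// hypothetical helper: rejects values outside the char range instead of wrapping
static char toCharChecked(int a) {
    if (a < Character.MIN_VALUE || a > Character.MAX_VALUE) {
        throw new IllegalArgumentException("value out of char range: " + a);
    }
    return (char) a;
}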
