
Why is ch1 == ch2 false? Don't they hold the same char value?

I'm trying to compare two char primitives ch1 and ch2. Both are assigned the value 1 as shown below.

But when I compare them with the "==" operator, the result is false. I don't understand why, or what is happening behind the scenes.

char ch1 = (char)1;
char ch2 = '1';
System.out.println(ch1==ch2); //false

//further comparisons
System.out.println(ch1 == 1);       //true
System.out.println(ch1 == '\u0031'); //false

System.out.println(ch2 == 1);       //false
System.out.println(ch2 == '\u0031'); //true

'1' has the value 49 (31 hexadecimal).

(char)1 has the value 1.

A char is just a 16-bit unsigned integer. The notation 'x' means 'the character code for the character x', where the encoding used in Java is Unicode, specifically UTF-16.

The cast (char) does not change the value of the expression to its right, except that it truncates it from a full-size integer to 16 bits (which is no change for values 0 to 65535).
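To make the values concrete, here is a minimal sketch that prints the number stored in each char. The class name CharValueDemo and the helper codeUnit are made up for this illustration; they are not part of the question's code.

```java
public class CharValueDemo {
    // Hypothetical helper: returns the 16-bit value stored in a char.
    static int codeUnit(char c) {
        return c; // implicit widening from char to int
    }

    public static void main(String[] args) {
        char ch1 = (char) 1; // the number 1, stored as a char
        char ch2 = '1';      // the character code of the digit 1

        System.out.println(codeUnit(ch1));          // 1
        System.out.println(codeUnit(ch2));          // 49
        System.out.println(codeUnit((char) 65537)); // 1, because 0x10001 is truncated to 16 bits
        System.out.println(ch1 == ch2);             // false, since 1 != 49
    }
}
```

Widening a char to int is lossless, so printing it as an int is a quick way to see what a comparison with == is really comparing.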

Basically, what you are doing is casting the number one to a char, so ch1 is now equal to Unicode character 1 (SOH, Start of Heading).

So when you compare ch1 (SOH) to ch2 ('1'), it returns false. Likewise, comparing ch1 (SOH, '\u0001') to '\u0031' (the digit '1') returns false.

That's the main reason it returns false: the Unicode value you assigned to ch1 is different from the one you expected.
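If the intent was to turn the number 1 into the digit character '1', a cast is not the right tool. The sketch below shows two common ways to do it, assuming a single decimal digit from 0 to 9; the class and method names are invented for this example.

```java
public class DigitToChar {
    // Hypothetical helper: converts a decimal digit (0..9) to its character.
    static char digitToChar(int digit) {
        return Character.forDigit(digit, 10); // radix 10
    }

    public static void main(String[] args) {
        char viaForDigit = digitToChar(1);
        char viaOffset = (char) ('0' + 1); // '0' is 48, so this is 49

        System.out.println(viaForDigit == '1');      // true
        System.out.println(viaOffset == '1');        // true
        System.out.println(viaForDigit == (char) 1); // false
    }
}
```

Both approaches produce the value 49, which is why they compare equal to '1' while (char) 1 does not.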

Code point

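The char type has been essentially broken since Java 2. As a 16-bit value, it is physically incapable of representing most characters: any character above U+FFFF needs two char values (a surrogate pair).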

Instead use code point integer numbers. Every character is permanently assigned a specific number, a code point.

int codePoint = "1".codePointAt( 0 ) ;  // Annoying zero-based index counting. 

The result is 49 decimal, 31 hexadecimal.

Build a single-character string from that code point (Character.toString( int ) is available from Java 11 onward).

String s = Character.toString( codePoint ) ;

Or more specifically:

String latinDigitOneCharacter = Character.toString( 49 ) ;
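As a rough illustration of why code points are safer than char, the sketch below compares a digit with a character outside the 16-bit range. The class name CodePointDemo and the helper codePointLength are hypothetical, and U+1F600 is just an arbitrary example character, written with escapes so the source file encoding does not matter.

```java
public class CodePointDemo {
    // Hypothetical helper: counts code points rather than 16-bit char units.
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String digit = "1";
        String face = "\uD83D\uDE00"; // U+1F600 as a surrogate pair

        System.out.println(digit.length() + " " + codePointLength(digit)); // 1 1
        System.out.println(face.length() + " " + codePointLength(face));   // 2 1
        System.out.println(Integer.toHexString(face.codePointAt(0)));      // 1f600
    }
}
```

The second string is one character to a reader but two char values to Java, which is the limitation the answer is describing.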

As others pointed out, your code was mistakenly comparing the character defined as the Latin digit “1” with a code point of 1.

The character assigned to the code point of one is the control code SOH, Start of Heading. This is true in both Unicode and US-ASCII (Unicode is a superset of US-ASCII).
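The difference between the two code points can also be checked with the classification methods on Character. This is only a sketch; the class SohVersusDigit and the method describe are names invented here.

```java
public class SohVersusDigit {
    // Hypothetical helper: gives a coarse label for a code point.
    static String describe(int codePoint) {
        if (Character.isISOControl(codePoint)) {
            return "control";
        }
        if (Character.isDigit(codePoint)) {
            return "digit";
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(describe(1));                    // control (SOH)
        System.out.println(describe(49));                   // digit
        System.out.println(describe("1".codePointAt(0)));   // digit
    }
}
```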
