
I used a char variable as an array index into a boolean array in Java. How does the conversion of a character to its ASCII value happen?

The code below checks whether the string contains a duplicate character:

String s = "Bengaluru";
boolean[] characters = new boolean[128];

for (int i = 0; i < s.length(); i++) {
    char ch = s.charAt(i);
    if (characters[ch]) {      // ch is implicitly widened to int for indexing
        return;                // duplicate found
    }
    characters[ch] = true;     // true is stored at the UTF-16 value of the character
}

The full answer is much more complicated than what dasblinkenlight is suggesting.

Since Java 5, the data type char no longer represents a character or Unicode code point, but a UTF-16 encoded value, which might be a complete character or a fraction of a character. This UTF-16 value is in reality just a 16-bit unsigned integer in the range 0 to 65535 and is automatically cast to an int when used as an array index, just like the other numeric data types such as short or byte. If you really want a Unicode code point as a character, you should use the method codePointAt(int index) instead of charAt(int index). A Unicode code point can be in the range 0 to 1114111 (0x10ffff).
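The difference matters for characters outside the Basic Multilingual Plane, which occupy two UTF-16 code units (a surrogate pair). A small sketch illustrating this, using U+1D54F (MATHEMATICAL DOUBLE-STRUCK CAPITAL X) as an arbitrary example character:

```java
public class CodePointDemo {
    public static void main(String[] args) {
        // 'A' followed by the surrogate pair encoding U+1D54F
        String s = "A\uD835\uDD4F";

        System.out.println(s.length());        // 3 -- counts UTF-16 code units, not characters
        System.out.println((int) s.charAt(1)); // 55349 (0xD835) -- only the high surrogate
        System.out.println(s.codePointAt(1));  // 120143 (0x1D54F) -- the full code point
        System.out.println(s.codePointCount(0, s.length())); // 2 -- number of code points
    }
}
```

Note that charAt(1) returns only half of the character, while codePointAt(1) reconstructs the whole code point from the surrogate pair.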

How the charAt and codePointAt methods work internally is implementation-specific. It is often incorrectly claimed that a String is just a wrapper around an array of char values, but the internal implementation of the String class is not mandated by the language or API specification. Since Java 6, the Oracle VM has used different optimization strategies to save memory and does not always use a plain char array.

Java represents char values using 16-bit Unicode code points*. There is no conversion to ASCII happening - it's just that the first 128 code points happen to represent the same characters as the corresponding ASCII values.

Java does perform a conversion of char to int in order to make the indexing possible. This is a built-in conversion that happens implicitly, because it is a widening conversion. In other words, any value that can be stored in a char can be represented in an int without loss.
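A minimal sketch of this implicit widening in action, counting a character into an int array without any explicit cast:

```java
public class WideningDemo {
    public static void main(String[] args) {
        int[] counts = new int[128];
        char ch = 'e';

        // No cast needed: the char is implicitly widened to int when indexing.
        counts[ch]++;                    // equivalent to counts[(int) ch]++

        System.out.println((int) ch);    // prints 101, the UTF-16 (and ASCII) value of 'e'
        System.out.println(counts[101]); // prints 1
    }
}
```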

* Java 5 switched to the UTF-16 representation, changing the interpretation of some values to "partial characters". char values remained 16-bit unsigned numbers, though.

Java supports automatic widening primitive conversions; see JLS §5.1.2:

https://docs.oracle.com/javase/specs/jls/se8/html/jls-5.html#jls-5.1.2

