Why do we use string.charAt(index)-'a' in Java?

Question

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class Main {
    public static void main(String[] args) throws IOException {
        // Read one line from standard input.
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        String s = br.readLine();

        int[] arr = new int[26];
        for (int i = 0; i < s.length(); i++)
            arr[s.charAt(i) - 'a']++;

        int odds = 0;
        for (int i = 0; i < 26; i++)
            if (arr[i] % 2 != 0)
                odds++;

        if (odds % 2 == 1 || odds == 0)
            System.out.println("First");
        else
            System.out.println("Second");
    }
}

I saw this snippet of code and found this part confusing. Could you please tell me why we use this, and what the significance of 'a' is in arr[s.charAt(i)-'a']++; ?

Answer 1

This code makes a histogram-like counter for each letter of the alphabet. Try printing a char such as 'a' as follows:

System.out.println((int)'a'); // Output: 97

Each char has a corresponding Unicode value between 0 and 65,535. Subtracting 'a' (that is, 97) shifts each lowercase letter of the alphabet into the 0-25 range, which corresponds to the 26 "buckets" in the arr array. Here's an example:

System.out.println('z' - 'a'); // Output: 25 (the last bucket in the array)
System.out.println('a' - 'a'); // Output: 0 (the first bucket in the array)

The second loop in the code checks the parity of each count to determine which are odd. Finally, the conditional checks the total number of letters that occur an odd number of times. If this total is 0 or itself odd, the program prints "First"; otherwise it prints "Second".
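The parity rule can be isolated into a small method so it is easier to try on different inputs. This is only a sketch: the class name ParityDemo and the method name winner are hypothetical, and the input is assumed to contain only lowercase a to z.

```java
public class ParityDemo {
    // Hypothetical helper: applies the same rule as the question's code.
    // Assumes s contains only lowercase letters 'a' to 'z'.
    static String winner(String s) {
        int[] counts = new int[26];
        for (int i = 0; i < s.length(); i++) {
            counts[s.charAt(i) - 'a']++;
        }

        // Count how many letters occur an odd number of times.
        int odds = 0;
        for (int count : counts) {
            if (count % 2 != 0) {
                odds++;
            }
        }
        return (odds % 2 == 1 || odds == 0) ? "First" : "Second";
    }

    public static void main(String[] args) {
        System.out.println(winner("aabb")); // no odd counts: First
        System.out.println(winner("ab"));   // two odd counts: Second
        System.out.println(winner("abc"));  // three odd counts: First
    }
}
```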

Try the original code with any character outside of a to z, such as a capital letter. It'll crash because the character's numeric value minus 'a' falls outside the array's bounds (it is negative or greater than 25), and you'll wind up with an ArrayIndexOutOfBoundsException .
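One way to avoid the crash is to check the range before indexing. The sketch below assumes that characters outside a to z should simply be ignored, which may or may not be what the original problem wants; the names SafeLetterCount and countLowercase are made up for illustration.

```java
public class SafeLetterCount {
    // Hypothetical helper: counts only 'a'..'z' and ignores every other character.
    static int[] countLowercase(String s) {
        int[] counts = new int[26];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 'a' && c <= 'z') { // guard keeps the index within 0..25
                counts[c - 'a']++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countLowercase("Hello, World!");
        System.out.println("l: " + counts['l' - 'a']); // l: 3
        System.out.println("o: " + counts['o' - 'a']); // o: 2
        System.out.println("h: " + counts['h' - 'a']); // h: 0 (the capital H is skipped)
    }
}
```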

Here's a sample program showing how the histogram is built and how each index is converted back to its letter through addition:

class Main {
    public static void main(String[] args) {
        String s = "snuffleupagus";
        int[] arr = new int[26];

        for (int i = 0; i < s.length(); i++) {
            arr[s.charAt(i)-'a']++;
        }

        for (int i = 0; i < arr.length; i++) {
            System.out.println((char)(i + 'a') + ": " + arr[i]);
        }
    }
}

Output:

a: 1
b: 0
c: 0
d: 0
e: 1
f: 2
g: 1
h: 0
i: 0
j: 0
k: 0
l: 1
m: 0
n: 1
o: 0
p: 1
q: 0
r: 0
s: 2
t: 0
u: 3
v: 0
w: 0
x: 0
y: 0
z: 0

Answer 2

arr is an int array of size 26, which is also the number of letters in the English alphabet. All that loop does is count the frequency of each letter, with each letter represented by its index in the alphabet: arr[0] is the count for 'a' , arr[1] is the count for 'b' , etc.

The technicalities of it can be explained simply. s.charAt(i) returns the char at the specified position i . In Java, a char is a 16-bit unsigned value holding a UTF-16 code unit, so it can take part in arithmetic. In the subtraction, both operands are promoted to int , and the code of 'a' (97, the same as its ASCII value) is subtracted from the code of the current character at i . So what you end up getting is 'a' - 'a' == 0 , 'b' - 'a' == 1 , and so on.
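The arithmetic can be checked directly. In this small sketch the class CharArithmetic and the method bucketIndex are hypothetical names; the point is that subtracting one char from another produces an int, and adding 'a' back recovers the letter.

```java
public class CharArithmetic {
    // Hypothetical helper: maps a letter to its offset from 'a'.
    static int bucketIndex(char c) {
        return c - 'a'; // both operands are promoted to int before subtracting
    }

    public static void main(String[] args) {
        int index = bucketIndex('d');
        System.out.println(index);                // 3
        System.out.println((char) (index + 'a')); // d
        System.out.println(bucketIndex('A'));     // -32, not a valid array index
    }
}
```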

Please note that this is probably not the best way to count characters in general, as a string can contain more than just lowercase letters, e.g. uppercase letters, digits, and many other symbols.
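If the input is not limited to lowercase letters, a map keyed by character is one possible alternative to the fixed 26-element array. The following is a sketch rather than a recommendation: AnyCharCount and countChars are hypothetical names, a TreeMap is chosen only so the printed order is predictable, and the count is per UTF-16 char, not per Unicode code point.

```java
import java.util.Map;
import java.util.TreeMap;

public class AnyCharCount {
    // Hypothetical helper: counts every char in s, not just 'a'..'z'.
    static Map<Character, Integer> countChars(String s) {
        Map<Character, Integer> counts = new TreeMap<>();
        for (int i = 0; i < s.length(); i++) {
            // Insert 1 for a new key, otherwise add 1 to the existing count.
            counts.merge(s.charAt(i), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countChars("Aa!a")); // {!=1, A=1, a=2}
    }
}
```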

2 answers

solution1
4 2018-08-31 20:44:14

solution2
2 2018-08-31 20:40:13
