简体   繁体   中英

What is happening in the following code of a trie

I was going over this example of a Trie and I am having difficulty understanding the following code

void Trie::addWord(string word)
{
    Node * currentNode = root;

    for (int i = 0; i < word.size(); ++i)
    {
        char currentChar = tolower(word.at(i));
        int index = currentChar - 'a';
        assert(index >= 0);     // Makes sure the character is between a-z
        if (currentNode->children[index] != NULL)
        {
            // check if the current node has the current character as one of its decendants
            currentNode = currentNode->children[index];
        }
        else
        {
            // the current node doesn't have the current character as one of its decendants
            Node * newNode = new Node(currentChar);
            currentNode->children[index] = newNode;
            currentNode = newNode;
        }
        if (i == word.size() - 1)
        {
            // the last character of the word has been reached
            currentNode->end = true;
        }
    }
}

My question is why is a being subtracted here

 int index = currentChar - 'a';

At the line int index = currentChar - 'a'; The currentChar (what ever it is) will be subtracted by 'a' character which has the ASCII value 97.
In this case you have two conditions:

  1. First If currentChar is between az, the result of index will always be >= 0 .

  2. Otherwise currentChar is not between az because index will be negative number, currentChar can't be between AZ because of the tolower() function.

You can refeare to this link to know more about ASCII values

Also you need to update the condition assert(index >= 0 && index < 26) because { ,} ,| and ~ will make index >=0

It will give you the letter position in alphabeta consider a as the position 0 . Here a is converted to its ASCII code and it is subsctracted from the az letter became more easy to handle.

What people tend to forget - a character is a byte with the ascii value in it.

so 'a' is 97 in the ascii table (I think), so all is being done - is currentChar-97.

please follow the http://www.asciitable.com for the actual values.

but always remember - a char is a byte, and you can use this fact a lot!
I am using it for skipping memory addresses in pointers (for example:

void* a = (void*)((char*)arr+2);

This code will return the memory of arr+ two bytes, while ignoring the type that is being held within arr)

Please do notice - |'a'-'A'| = |'A'-'a'| , but one will be negative, while the other one will be positive

int index = currentChar - 'a'; this is to check the character because the user can enter a character that is not between a to z.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM