
Trie tree struct declaration

So I have this piece of code (which is not mine), and I can't for the life of me understand what these structures look like. Can someone explain, please?

typedef struct trie_node trie_node_t;
struct trie_node
{
    int value;
    trie_node_t *children[ALPHABET_SIZE];
};

// trie ADT
typedef struct trie trie_t;
struct trie
{
    trie_node_t *root;
    int count;
};

The int count in the second struct counts all the words put into the tree, but I would like to know how many times each individual word was put in. Apart from modifying the rest of the code, how should I modify the structures to achieve that?

Rest of the code: http://pastebin.com/9zQuCBjb

I suppose you are familiar with the concept of a trie, where you find words and prefixes of words by walking (or crawling, to use the words of the linked code) down the tree with the letters of the word and branching at each node according to the letters you find. Each node has many children; 26 if you use the case-insensitive Latin alphabet.

The word is encoded in the path on which you get there:

root->[f]->[i]->[s]->[h]  --> "fish"

Now you need to know whether the current node represents a word. "fish" is a word, but "fis" isn't. You can't use the fact that the node is a leaf without children, because "fishbone" might be in the dictionary. That's what the value entry is for: Zero means the current node does not represent a word, otherwise the value is a one-based index of the current word.

When you create a new entry, you just crawl down the trie, possibly creating new nodes as you go, and mark the last node with the current count of words as its value. If "fishbone" is already in the trie and you add "fish", you don't create new nodes and only mark the "h" node with a new value.
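As a rough sketch of that crawl (not the linked code itself), insertion and lookup could look like the following. The function names new_node, trie_insert and trie_search are made up for illustration, and I'm assuming ALPHABET_SIZE is 26, keys contain only lowercase 'a' to 'z', and the count is incremented before it is stored as the value.

```c
#include <stdlib.h>

#define ALPHABET_SIZE 26 /* assumption: keys use lowercase 'a'..'z' only */

typedef struct trie_node trie_node_t;
struct trie_node
{
    int value; /* 0 = no word ends here, otherwise one-based word index */
    trie_node_t *children[ALPHABET_SIZE];
};

typedef struct trie trie_t;
struct trie
{
    trie_node_t *root;
    int count; /* number of words inserted so far */
};

/* Allocate a node with value 0 and no children; NULL on failure. */
static trie_node_t *new_node(void)
{
    trie_node_t *node = malloc(sizeof *node);
    if (node != NULL) {
        node->value = 0;
        for (int i = 0; i < ALPHABET_SIZE; i++)
            node->children[i] = NULL;
    }
    return node;
}

/* Crawl down the trie along key, creating missing nodes, then mark the
   last node with the running word count. Returns 0 on success, -1 on
   allocation failure or a character outside 'a'..'z'. */
static int trie_insert(trie_t *t, const char *key)
{
    if (t->root == NULL && (t->root = new_node()) == NULL)
        return -1;

    trie_node_t *node = t->root;
    for (; *key != '\0'; key++) {
        if (*key < 'a' || *key > 'z')
            return -1;
        int i = *key - 'a';
        if (node->children[i] == NULL &&
            (node->children[i] = new_node()) == NULL)
            return -1;
        node = node->children[i];
    }
    t->count++;
    node->value = t->count; /* overwrites any value that was already there */
    return 0;
}

/* Return the value stored for key, or 0 if key is not a word in the trie. */
static int trie_search(const trie_t *t, const char *key)
{
    const trie_node_t *node = t->root;
    for (; node != NULL && *key != '\0'; key++) {
        if (*key < 'a' || *key > 'z')
            return 0;
        node = node->children[*key - 'a'];
    }
    return node != NULL ? node->value : 0;
}
```

Starting from an empty trie_t (root NULL, count 0), inserting "fishbone" and then "fish" leaves trie_search returning 1 for "fishbone", 2 for "fish" and 0 for "fis". The sketch never frees its nodes; a real program would want a matching cleanup routine.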

The trie struct is just a helper to contain the trie's root node and a count.

If you want to keep track of occurrences, add a count field to the nodes and increment it whenever you set value. (The original code doesn't check whether a word is already in the trie; it adds words unconditionally, thereby overwriting any old value.)
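A minimal sketch of that change might look like this. The count field in the node is the only addition to the structures; trie_insert and trie_occurrences are hypothetical names, and I'm again assuming ALPHABET_SIZE is 26 with lowercase 'a' to 'z' keys. Like the original, it sets value on every insert.

```c
#include <stdlib.h>

#define ALPHABET_SIZE 26 /* assumption: keys use lowercase 'a'..'z' only */

typedef struct trie_node trie_node_t;
struct trie_node
{
    int value; /* 0 = no word ends here, otherwise one-based word index */
    int count; /* new field: how many times this exact word was inserted */
    trie_node_t *children[ALPHABET_SIZE];
};

typedef struct trie trie_t;
struct trie
{
    trie_node_t *root;
    int count; /* total number of insertions, repeats included */
};

/* Allocate a node with zero value/count and no children; NULL on failure. */
static trie_node_t *new_node(void)
{
    trie_node_t *node = malloc(sizeof *node);
    if (node != NULL) {
        node->value = 0;
        node->count = 0;
        for (int i = 0; i < ALPHABET_SIZE; i++)
            node->children[i] = NULL;
    }
    return node;
}

/* Insert key and bump the per-word counter on its last node.
   Returns 0 on success, -1 on allocation failure or invalid character. */
static int trie_insert(trie_t *t, const char *key)
{
    if (t->root == NULL && (t->root = new_node()) == NULL)
        return -1;

    trie_node_t *node = t->root;
    for (; *key != '\0'; key++) {
        if (*key < 'a' || *key > 'z')
            return -1;
        int i = *key - 'a';
        if (node->children[i] == NULL &&
            (node->children[i] = new_node()) == NULL)
            return -1;
        node = node->children[i];
    }
    t->count++;
    node->value = t->count; /* set unconditionally, as in the original */
    node->count++;          /* incremented every time value is set */
    return 0;
}

/* Return how many times key was inserted, or 0 if it never was. */
static int trie_occurrences(const trie_t *t, const char *key)
{
    const trie_node_t *node = t->root;
    for (; node != NULL && *key != '\0'; key++) {
        if (*key < 'a' || *key > 'z')
            return 0;
        node = node->children[*key - 'a'];
    }
    return node != NULL ? node->count : 0;
}
```

After inserting "fish" twice and "fishbone" once, trie_occurrences gives 2 for "fish", 1 for "fishbone" and 0 for "fis", while the trie's own count is 3. If you would rather keep the first value a word received, set value only when it is still zero.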

You can also keep a count of all words starting with the prefix at the current node by having a prefix_count field and incrementing that whenever you pass a node when inserting a key.
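Here is one way the prefix idea could be sketched, under the same assumptions as before (ALPHABET_SIZE of 26, lowercase keys, made-up function names). It builds the path first and only then bumps prefix_count on every node along it, root included, so a failed insert does not leave half-updated counts.

```c
#include <stdlib.h>

#define ALPHABET_SIZE 26 /* assumption: keys use lowercase 'a'..'z' only */

typedef struct trie_node trie_node_t;
struct trie_node
{
    int value;        /* 0 = no word ends here, otherwise word index */
    int prefix_count; /* insertions whose key passes through this node */
    trie_node_t *children[ALPHABET_SIZE];
};

typedef struct trie trie_t;
struct trie
{
    trie_node_t *root;
    int count;
};

/* Allocate a node with zeroed fields and no children; NULL on failure. */
static trie_node_t *new_node(void)
{
    trie_node_t *node = malloc(sizeof *node);
    if (node != NULL) {
        node->value = 0;
        node->prefix_count = 0;
        for (int i = 0; i < ALPHABET_SIZE; i++)
            node->children[i] = NULL;
    }
    return node;
}

/* Insert key. Returns 0 on success, -1 on allocation failure or a
   character outside 'a'..'z' (no counts are changed in that case). */
static int trie_insert(trie_t *t, const char *key)
{
    if (t->root == NULL && (t->root = new_node()) == NULL)
        return -1;

    /* First pass: make sure the whole path exists. */
    trie_node_t *node = t->root;
    for (const char *p = key; *p != '\0'; p++) {
        if (*p < 'a' || *p > 'z')
            return -1;
        int i = *p - 'a';
        if (node->children[i] == NULL &&
            (node->children[i] = new_node()) == NULL)
            return -1;
        node = node->children[i];
    }
    t->count++;
    node->value = t->count;

    /* Second pass: count this insertion on every node of the path. */
    node = t->root;
    node->prefix_count++;
    for (const char *p = key; *p != '\0'; p++) {
        node = node->children[*p - 'a'];
        node->prefix_count++;
    }
    return 0;
}

/* Return how many inserted keys (repeats included) start with prefix. */
static int trie_prefix_count(const trie_t *t, const char *prefix)
{
    const trie_node_t *node = t->root;
    for (; node != NULL && *prefix != '\0'; prefix++) {
        if (*prefix < 'a' || *prefix > 'z')
            return 0;
        node = node->children[*prefix - 'a'];
    }
    return node != NULL ? node->prefix_count : 0;
}
```

With "fish", "fishbone", "fig" and "fish" again inserted, trie_prefix_count reports 4 for "fi", 3 for "fish" and 1 for "fishb".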

When you want to retrieve the occurrences, you'll have to walk all subtrees.
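Walking all subtrees is a plain recursive traversal that rebuilds each word from the path as it goes. The sketch below assumes nodes carry a per-word count field, ALPHABET_SIZE is 26 and keys are lowercase; MAX_WORD, word_fn, trie_for_each_word and the report_word callback are all names invented for this example.

```c
#include <stdio.h>
#include <stdlib.h>

#define ALPHABET_SIZE 26 /* assumption: keys use lowercase 'a'..'z' only */
#define MAX_WORD 64      /* assumption: longest word reported, '\0' included */

typedef struct trie_node trie_node_t;
struct trie_node
{
    int value;
    int count; /* how many times this exact word was inserted */
    trie_node_t *children[ALPHABET_SIZE];
};

typedef struct trie trie_t;
struct trie
{
    trie_node_t *root;
    int count;
};

/* Allocate a node with zeroed fields and no children; NULL on failure. */
static trie_node_t *new_node(void)
{
    trie_node_t *node = malloc(sizeof *node);
    if (node != NULL) {
        node->value = 0;
        node->count = 0;
        for (int i = 0; i < ALPHABET_SIZE; i++)
            node->children[i] = NULL;
    }
    return node;
}

/* Insert key, counting repeats. Returns 0 on success, -1 on failure. */
static int trie_insert(trie_t *t, const char *key)
{
    if (t->root == NULL && (t->root = new_node()) == NULL)
        return -1;
    trie_node_t *node = t->root;
    for (; *key != '\0'; key++) {
        if (*key < 'a' || *key > 'z')
            return -1;
        int i = *key - 'a';
        if (node->children[i] == NULL &&
            (node->children[i] = new_node()) == NULL)
            return -1;
        node = node->children[i];
    }
    t->count++;
    node->value = t->count;
    node->count++;
    return 0;
}

/* Callback type: called once per word with its occurrence count. */
typedef void (*word_fn)(const char *word, int count, void *ctx);

/* Visit node and all its subtrees; word[0..depth) holds the path so far. */
static void walk(const trie_node_t *node, char *word, size_t depth,
                 word_fn fn, void *ctx)
{
    if (node == NULL)
        return;
    if (node->count > 0) {
        word[depth] = '\0';
        fn(word, node->count, ctx);
    }
    if (depth + 1 >= MAX_WORD)
        return; /* no room for another letter; longer words are skipped */
    for (int i = 0; i < ALPHABET_SIZE; i++) {
        word[depth] = (char)('a' + i);
        walk(node->children[i], word, depth + 1, fn, ctx);
    }
}

/* Call fn for every word in the trie, in alphabetical order. */
static void trie_for_each_word(const trie_t *t, word_fn fn, void *ctx)
{
    char word[MAX_WORD];
    walk(t->root, word, 0, fn, ctx);
}

/* Example callback: appends "word=count\n" lines to a text buffer. */
struct report { char text[256]; size_t len; };

static void report_word(const char *word, int count, void *ctx)
{
    struct report *r = ctx;
    size_t room = sizeof r->text - r->len;
    int n = snprintf(r->text + r->len, room, "%s=%d\n", word, count);
    if (n > 0 && (size_t)n < room)
        r->len += (size_t)n;
    else
        r->text[r->len] = '\0'; /* did not fit: drop the partial line */
}
```

To use it, pass trie_for_each_word a callback and a context pointer, for instance report_word with a zero-initialised struct report. Because the children are visited in index order, the words come out alphabetically, with a word listed before any longer word that it prefixes.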

Tries are useful for autoexpansion of words from the first letters of user input or T9-style typing systems, but they are rather memory greedy. If you just want to count the occurrences of words (without making use of the benefits of a trie), it might be easier to achieve this with a single hash map of word to count.
