
Building a Prefix Trie in C++, Suffix Trie

I used the video to understand the prefix trie (my eventual goal is a suffix trie). The link to the sample code is broken, so I wrote the code below from the video. There are two functions, insert and search:

void insert(string word)
{
    node* current=head;
    current->prefix_count++;
    for(unsigned int i=0;i<word.length();++i)
    {
        int letter=(int)word[i]-(int)'a';
        if (current->child[letter]==NULL)
            current->child[letter]=new node();
        current->child[letter]->prefix_count++;
        current=current->child[letter];
    }
    current->is_end=true;
}

bool search(string word)
{
    node *current=head;
    for(unsigned int i=0;i<word.length();++i)
    {
        if(current->child[((int)word[i]-(int)'a')]==NULL)
            return false;
        current=current->child[((int)word[i]-(int)'a')];
    }
    return current->is_end;
}

Then I implemented main as follows:

int main(){
    node* head=NULL;

    string s="abbaa";
    init();
    insert(s);
    if(search("ab")==true) cout<<"Found"<<endl;
    else cout<<"Not found"<<endl;
}

And I am getting the following output: Not found

This is confusing, since "ab" occurs in the string s.

Lastly, I am trying to understand this line:

int letter=(int)word[i]-(int)'a';

Does this mean we take the ASCII code of the current character and subtract the ASCII code of 'a' from it?

Thank you

There are some differences between suffix and prefix trees.

Prefix tree - a tree that contains all the words (or other chunks separated by some symbol) from a given text. E.g. for the text "you have a text", the prefix tree contains 4 words: ["you", "have", "a", "text"] (but not "hav").
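To make the "have" versus "hav" distinction concrete, here is a minimal sketch of such a word trie. It assumes lowercase a-z input only; the names `Node`, `insert_words` and the `std::unique_ptr` layout are illustrative choices, not the code from the question.

```cpp
#include <array>
#include <cassert>
#include <memory>
#include <sstream>
#include <string>

// Illustrative node: one child slot per letter 'a'..'z' plus an end-of-word flag.
struct Node {
    std::array<std::unique_ptr<Node>, 26> child;
    bool is_end = false;
};

// Insert one word, creating nodes as needed. Assumes lowercase a-z only.
void insert(Node& root, const std::string& word) {
    Node* current = &root;
    for (char c : word) {
        assert(c >= 'a' && c <= 'z');  // precondition of this sketch
        int letter = c - 'a';
        if (!current->child[letter])
            current->child[letter] = std::make_unique<Node>();
        current = current->child[letter].get();
    }
    current->is_end = true;  // marks a complete word, not just a prefix
}

// True only if `word` was inserted as a whole word.
bool search(const Node& root, const std::string& word) {
    const Node* current = &root;
    for (char c : word) {
        int letter = c - 'a';
        if (letter < 0 || letter >= 26 || !current->child[letter])
            return false;
        current = current->child[letter].get();
    }
    return current->is_end;
}

// Split `text` on whitespace and insert every word.
void insert_words(Node& root, const std::string& text) {
    std::istringstream in(text);
    std::string word;
    while (in >> word) insert(root, word);
}
```

After `insert_words(root, "you have a text")`, `search(root, "have")` is true while `search(root, "hav")` is false: the path for "hav" exists, but its last node is not flagged as the end of a word.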

Suffix tree - a prefix tree that contains all the suffixes of a given word. E.g. for the string "abacaba", the suffix tree contains 7 words: ["abacaba", "bacaba", "acaba", "caba", "aba", "ba", "a"].
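As a small illustration, the suffixes of a string can be listed with `std::string::substr`. The helper name `suffixes` is hypothetical:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Return every non-empty suffix of `s`, longest first.
std::vector<std::string> suffixes(const std::string& s) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < s.size(); ++i)
        out.push_back(s.substr(i));  // characters from position i to the end
    assert(out.size() == s.size());  // a string of length N has N non-empty suffixes
    return out;
}
```

For "abacaba" this yields 7 strings, starting with "abacaba" and ending with "a".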

A naive implementation of a suffix tree builds on the prefix tree implementation: insert every suffix of the input string, which takes O(N^2) time (so, in your code, you should insert all suffixes of the string S into the trie). There is also the more clever Ukkonen's algorithm, which works in linear time.
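The naive approach could look roughly like the sketch below. It assumes lowercase a-z input; `Node`, `build_suffix_trie` and `contains` are illustrative names rather than the code from the question. Note that the lookup only checks that the walk succeeds, because a substring is a prefix of some suffix. The `search` in the question additionally requires `is_end`, which is why `search("ab")` reports "Not found" after inserting only "abbaa".

```cpp
#include <array>
#include <cassert>
#include <memory>
#include <string>

// Illustrative node: one child slot per letter 'a'..'z'.
struct Node {
    std::array<std::unique_ptr<Node>, 26> child;
};

// Insert every suffix of `text`: O(N^2) time and, in the worst case, O(N^2) nodes.
void build_suffix_trie(Node& root, const std::string& text) {
    for (std::size_t start = 0; start < text.size(); ++start) {
        Node* current = &root;
        for (std::size_t i = start; i < text.size(); ++i) {
            assert(text[i] >= 'a' && text[i] <= 'z');  // precondition of this sketch
            int letter = text[i] - 'a';
            if (!current->child[letter])
                current->child[letter] = std::make_unique<Node>();
            current = current->child[letter].get();
        }
    }
}

// A pattern occurs in the text iff it is a prefix of some suffix,
// so reaching the end of the pattern is enough; no end flag is checked.
bool contains(const Node& root, const std::string& pattern) {
    const Node* current = &root;
    for (char c : pattern) {
        int letter = c - 'a';
        if (letter < 0 || letter >= 26 || !current->child[letter])
            return false;
        current = current->child[letter].get();
    }
    return true;
}
```

With the trie built from "abbaa", `contains(root, "ab")` and `contains(root, "bba")` are true, while `contains(root, "aab")` is false.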

Commonly, a prefix tree is used when you want to find a word (e.g. from some dictionary) in a text; a suffix tree is used to find some pattern as a substring of the text.

So you should choose the tree you need depending on your problem.

And yes, you are right in your last question.
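A tiny sketch of that index computation, assuming an ASCII-compatible character set (where 'a' is 97 and the lowercase letters are contiguous); the function name `letter_index` is hypothetical:

```cpp
#include <cassert>

// Map 'a'..'z' to 0..25 by subtracting the character code of 'a'.
// Example in ASCII: 'c' is 99 and 'a' is 97, so 'c' maps to 2.
int letter_index(char c) {
    assert(c >= 'a' && c <= 'z');  // anything else would index outside child[0..25]
    return c - 'a';
}
```

The range check matters in practice: an uppercase letter or a digit would produce a negative or too-large index into the child array.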


 