
Counting the number of occurrences of each prefix using Knuth-Morris-Pratt

I am working on my competitive coding skills and came across an article on counting the number of occurrences of each prefix in a string. Here is the problem statement: Given a string s of length n. In the first variation of the problem, we want to count the number of appearances of each prefix s[0…i] in the same string. In the second variation of the problem, another string t is given and we want to count the number of appearances of each prefix s[0…i] in t. I found the solution to it.

Code:

vector<int> ans(n + 1);            // ans[len] = count for the prefix of length len
for (int i = 0; i < n; i++)
    ans[pi[i]]++;
for (int i = n-1; i > 0; i--)
    ans[pi[i-1]] += ans[i];
for (int i = 0; i <= n; i++)
    ans[i]++;

I am not able to understand the problem statement completely. As far as I know, this is what a prefix is:

For example, the string geekforgeek has {g, ge, gee, geek, geekf, geekfo, geekfor, geekforg, geekforge, geekforgee} as its proper prefixes. Can somebody help me understand what this question is trying to compute? As far as I can see, these are the only prefixes available, and each occurs only once. Thanks in advance.

If you have read this far, I am assuming that you know the prefix function. Consider the string str = "ABACABA"; its prefix array pi is {0, 0, 1, 0, 1, 2, 3}. To store the number of occurrences of every prefix, we create a new array or vector (whichever is convenient) occ of length str.length()+1, where occ[i] denotes the number of occurrences of the prefix of length i, that is str[0:i] with the end index excluded.
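For reference, the pi array above can be reproduced with the usual prefix function. This is a minimal sketch, not code from the question; the names prefix_function and check_example are chosen here for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// pi[i] = length of the longest proper prefix of s[0..i]
// that is also a suffix of s[0..i].
std::vector<int> prefix_function(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> pi(n, 0);
    for (int i = 1; i < n; i++) {
        int j = pi[i - 1];               // longest border of the previous prefix
        while (j > 0 && s[i] != s[j])
            j = pi[j - 1];               // fall back to the next shorter border
        if (s[i] == s[j])
            j++;
        pi[i] = j;
    }
    return pi;
}

// Self-check against the array quoted above (hypothetical helper).
void check_example() {
    assert((prefix_function("ABACABA") == std::vector<int>{0, 0, 1, 0, 1, 2, 3}));
}
```

Calling check_example() passes silently if the array matches.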

As you quoted in the code above, there are three loops, so first I should explain what each loop actually computes. The first loop is straightforward: for each length l, it counts the positions i at which the longest prefix that is also a suffix of str[0:i+1] has length l. For the prefix "A", this happens for str[0:3] ("ABA") and str[0:5] ("ABACA"): in both strings "A" is the longest prefix that is also a suffix. We read this directly from the array pi computed above, since pi[i] stores exactly that length. Similarly, "AB" is the longest prefix that is also a suffix of str[0:6] ("ABACAB"), and so on. After the first loop we get occ = {3, 2, 1, 1, 0, 0, 0, 0}. I hope the idea of the first loop is clear.
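The first loop on its own can be sketched as a small function. The name count_longest_borders is made up for this illustration, and pi is assumed to be a valid prefix array.

```cpp
#include <cassert>
#include <vector>

// First loop only: occ[len] = number of positions i with pi[i] == len.
// The result has pi.size() + 1 entries, one per prefix length 0..n.
std::vector<int> count_longest_borders(const std::vector<int>& pi) {
    int n = static_cast<int>(pi.size());
    std::vector<int> occ(n + 1, 0);
    for (int len : pi) {
        assert(len >= 0 && len <= n);    // guard against a malformed pi
        occ[len]++;
    }
    return occ;
}
```

For pi = {0, 0, 1, 0, 1, 2, 3} this returns {3, 2, 1, 1, 0, 0, 0, 0}, the array given above.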

Now take the prefix "A" again. Looking at "ABACABA", the whole string str[0:7] also has "A" as a suffix, but it is not the longest one: the longest prefix that is also a suffix there is "ABA". So the first loop missed this occurrence of "A". In general, if a prefix of length l is also a suffix ending at index i, the next shorter prefix of length l' < l that is also a suffix ending at i has length pi[l-1] (with l = pi[i], that is pi[pi[i]-1]), as follows from the prefix function. In this way the array pi lets us reach the shorter prefixes that the first loop did not count. If we know that the prefix of length i appears exactly occ[i] times, then this number must be added to the number of occurrences of its longest suffix that is also a prefix, which has length pi[i-1]. The loop runs from the longest prefixes down to the shortest, so each occ[i] is complete before it is passed on.
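The propagation step can be sketched like this, following the "pi[l-1]" description above with occ indexed by prefix length. The function name propagate_to_borders is an assumption made for this example.

```cpp
#include <cassert>
#include <vector>

// Second loop only: hand the count of the length-i prefix down to its
// longest border, which has length pi[i-1]. Going from long to short
// means occ[i] is final before it is added to a shorter prefix.
std::vector<int> propagate_to_borders(const std::vector<int>& pi,
                                      std::vector<int> occ) {
    int n = static_cast<int>(pi.size());
    assert(static_cast<int>(occ.size()) == n + 1);
    for (int i = n - 1; i > 0; i--)
        occ[pi[i - 1]] += occ[i];
    return occ;
}
```

Starting from occ = {3, 2, 1, 1, 0, 0, 0, 0} and pi = {0, 0, 1, 0, 1, 2, 3}, this gives {7, 3, 1, 1, 0, 0, 0, 0}: the count for "ABA" has been passed on to "A", and everything ends up in the empty prefix at index 0.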

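The third loop adds one to every entry for the prefix's own occurrence at the start of the string. The first two loops only count the places where a prefix reappears as a suffix of a longer prefix, so that original occurrence is not included until now.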

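So the code should be

vector<int> ans(n + 1);
for (int i = 0; i < n; i++)
    ans[pi[i]]++;
for (int i = n-1; i > 0; i--)
    ans[pi[i-1]] += ans[i];     // note the index: pi[i-1], not pi[i]-1
for (int i = 0; i <= n; i++)
    ans[i]++;

Note the index in the second loop. Because ans is indexed by prefix length, the longest border of the prefix of length i is pi[i-1]. Writing ans[pi[i]-1] instead would read index -1 whenever pi[i] is 0. The article is https://cp-algorithms.web.app/string/prefix-function.html#counting-the-number-of-occurrences-of-each-prefix and it uses pi[i-1] as well. Please correct me if I am wrong.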
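One way to settle the index question is to compare the three loops against a brute-force count. The sketch below uses pi[i-1] in the second loop, as in the code quoted in the question; the helper names count_prefix_occurrences, brute_force and check are assumptions made for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<int> prefix_function(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> pi(n, 0);
    for (int i = 1; i < n; i++) {
        int j = pi[i - 1];
        while (j > 0 && s[i] != s[j])
            j = pi[j - 1];
        if (s[i] == s[j])
            j++;
        pi[i] = j;
    }
    return pi;
}

// The three loops, with ans[len] = occurrences of the prefix of length len.
std::vector<int> count_prefix_occurrences(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> pi = prefix_function(s);
    std::vector<int> ans(n + 1, 0);
    for (int i = 0; i < n; i++)
        ans[pi[i]]++;
    for (int i = n - 1; i > 0; i--)
        ans[pi[i - 1]] += ans[i];
    for (int i = 0; i <= n; i++)
        ans[i]++;
    return ans;
}

// Brute force: try every start position for every prefix length.
std::vector<int> brute_force(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> ans(n + 1, 0);
    for (int len = 0; len <= n; len++)
        for (int start = 0; start + len <= n; start++)
            if (s.compare(start, len, s, 0, len) == 0)
                ans[len]++;
    return ans;
}

// Aborts if the two methods disagree on s.
void check(const std::string& s) {
    assert(count_prefix_occurrences(s) == brute_force(s));
}
```

Both methods agree on "ABACABA", giving {8, 4, 2, 2, 1, 1, 1, 1}. For the string from the question, "geekforgeek", the prefixes g, ge, gee and geek each get a count of 2 and every longer prefix gets 1. Index 0 is the empty prefix, which is counted n+1 times.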
