
Time complexity of LPS calculation of KMP

Everywhere it is mentioned that, while calculating the LPS array for KMP, the inner loop can only backtrack by as much as was previously incremented. Still, it is not clear to me why the overall complexity is O(length(pat)).

KMP maintains two indexes:

k: the position in Text since which you have a (partial) match of your pattern
i: the latest symbol in Text that is currently being matched.

The first part is very simple: you just compare the next symbol of the pattern with the symbol at position i of your Text, and if they match you increment i.

If they don't match, we use the precalculated prefix function to shorten the currently matched part of the pattern (which moves k forward) and try to match the same symbol of Text again against the shorter version. And so on, until we have a match and do ++i, or until k reaches i and we have a brand new start.

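In the worst case, both k and i go fully through Text, which gives at most 2 * len(T) steps, because every step moves at least one of them forward and neither ever moves back.

So the complexity is O(T + P) all the time: O(P) to precalculate the prefix function and O(T) for the scan. The cost of the scan does not depend on the length of the matched prefix. This means that if you use KMP to find multiple matches of one pattern, you still get O(T + P).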
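The two-index argument above can be checked with a small sketch. This is only an illustration under assumptions, not the answerer's code: the names build_lps, kmp_count and steps are hypothetical, the pattern is assumed to be non-empty and at most 256 symbols long, and steps simply counts loop iterations so the 2 * len(T) bound can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* lps[j] = length of the longest proper prefix of pat[0..j]
   that is also a suffix of pat[0..j]. */
static void build_lps(const char *pat, size_t m, size_t *lps) {
    size_t i = 0, j = 1;
    if (m == 0) return;
    lps[0] = 0;
    while (j < m) {
        if (pat[i] == pat[j]) { lps[j] = i + 1; ++i; ++j; }
        else if (i == 0)      { lps[j] = 0; ++j; }
        else                  { i = lps[i - 1]; }
    }
}

/* Counts (possibly overlapping) occurrences of pat in text.
   k = start of the current partial match, i = text symbol being compared,
   so the current matched length is i - k.
   *steps receives the number of loop iterations (hypothetical counter). */
static size_t kmp_count(const char *text, const char *pat, size_t *steps) {
    size_t n = strlen(text), m = strlen(pat);
    size_t lps[256];               /* assumption: pattern fits in 256 symbols */
    size_t k = 0, i = 0, found = 0;

    assert(m > 0 && m <= 256);
    build_lps(pat, m, lps);
    *steps = 0;

    while (i < n) {
        ++*steps;
        if (text[i] == pat[i - k]) {
            ++i;                              /* match: i moves forward */
            if (i - k == m) {                 /* whole pattern matched */
                ++found;
                k = i - lps[m - 1];           /* keep the longest border */
            }
        } else if (i == k) {
            ++i; ++k;                         /* nothing matched: fresh start */
        } else {
            k = i - lps[i - k - 1];           /* shorten the match: k moves forward */
        }
    }
    return found;
}
```

Every iteration moves i or k forward, and k never passes i, so for a text of length n the counter cannot exceed 2 * n, regardless of how many matches are found.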

Looks like I figured it out. The code looks like this:

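    /* Assumed setup, not shown in the original snippet:
       i = 0, j = 1, tab[0] = 0, len1 = length of needle. */
    while (j < len1) {
        if (needle[i] == needle[j]) {
            /* The matched prefix of length i extends by one symbol. */
            tab[j] = i + 1;
            j++;
            i++;
        } else {
            if (i == 0) {
                /* No shorter prefix left to try. */
                tab[j] = 0;
                j++;
            } else {
                /* Fall back to the next shorter prefix; j stays where it is. */
                i = tab[i - 1];
            }
        }
    }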

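So basically we never decrement j. In some of the iterations (else->else) we do not increment j, and i is moved back instead, at most until it reaches 0. Since i is only ever incremented together with j, and every else->else iteration strictly decreases i, the total backward movement of i can never be more than the total forward movement of j. So if j moves n steps, there are at most n iterations in which j is not incremented. That makes the total number of iterations at most n + n = 2n, hence the complexity is O(n).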
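To see the 2n bound in practice, here is a minimal sketch of the same loop with two counters added. The function name lps_with_count and the counters iterations and fallbacks are hypothetical additions for illustration; the setup i = 0, j = 1, tab[0] = 0 is assumed, since the snippet above does not show it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fills tab[] for needle and returns the number of loop iterations.
   *fallbacks receives how many times the else->else branch ran.
   tab must have room for strlen(needle) entries. */
static size_t lps_with_count(const char *needle, size_t *tab, size_t *fallbacks) {
    size_t len1, i = 0, j = 1, iterations = 0;

    assert(needle != NULL && tab != NULL && fallbacks != NULL);
    len1 = strlen(needle);
    *fallbacks = 0;
    if (len1 == 0) return 0;
    tab[0] = 0;

    while (j < len1) {
        ++iterations;
        if (needle[i] == needle[j]) {
            tab[j] = i + 1;        /* j and i advance together */
            ++j;
            ++i;
        } else {
            if (i == 0) {
                tab[j] = 0;        /* j advances alone */
                ++j;
            } else {
                i = tab[i - 1];    /* only i moves, and only backward */
                ++*fallbacks;
            }
        }
    }
    return iterations;
}
```

For "aaab" the loop runs 5 times, 2 of them in the else->else branch, which is well within 2 * 4 = 8. The fallback count can never exceed the number of times i was incremented, which in turn is less than the length of the needle.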


 