
Getting indices of patterns found in a sequence using KNUTH-MORRIS-PRATT?

I am trying to find patterns in a sequence of integers. I found a Knuth-Morris-Pratt (KMP) implementation at this link.

I passed the function a 'pattern' to find in a 'text', but the output of the KMP function is an object. I need the indices of the occurrences of the pattern in the text. I tried inspecting the object's attributes by typing a dot and pressing Tab, but nothing is there. How can I get the indices?

Edit

Code:

> # Knuth-Morris-Pratt string matching
> # David Eppstein, UC Irvine, 1 Mar 2002
> 
> from __future__ import generators
> 
> def KnuthMorrisPratt(text, pattern):
> 
>     '''Yields all starting positions of copies of the pattern in the text.
>     Calling conventions are similar to string.find, but its arguments can be
>     lists or iterators, not just strings, it returns all matches, not just
>     the first one, and it does not need the whole text in memory at once.
>     Whenever it yields, it will have read the text exactly up to and
>     including the match that caused the yield.'''
> 
>     # allow indexing into pattern and protect against change during yield
>     pattern = list(pattern)
> 
>     # build table of shift amounts
>     shifts = [1] * (len(pattern) + 1)
>     shift = 1
>     for pos in range(len(pattern)):
>         while shift <= pos and pattern[pos] != pattern[pos-shift]:
>             shift += shifts[pos-shift]
>         shifts[pos+1] = shift
> 
>     # do the actual search
>     startPos = 0
>     matchLen = 0
>     for c in text:
>         while matchLen == len(pattern) or \
>               matchLen >= 0 and pattern[matchLen] != c:
>             startPos += shifts[matchLen]
>             matchLen -= shifts[matchLen]
>         matchLen += 1
>         if matchLen == len(pattern):
>             yield startPos

Sample Text: [1, 2, 2, 3, 3, 2, 4, 5, 2, 2, 3, 2]
Sample Pattern: [2, 2, 3]

Sample output: [1, 8] 

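The function is a generator: it yields the indices one at a time instead of returning a list, so calling it gives you a generator object. You need to loop through that iterator, for example with a list comprehension, to get the indices. The complete code with the call looks like this: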

from __future__ import generators

def KnuthMorrisPratt(text, pattern):

    pattern = list(pattern)

    # build table of shift amounts
    shifts = [1] * (len(pattern) + 1)
    shift = 1
    for pos in range(len(pattern)):
        while shift <= pos and pattern[pos] != pattern[pos-shift]:
            shift += shifts[pos-shift]
        shifts[pos+1] = shift

    # do the actual search
    startPos = 0
    matchLen = 0
    for c in text:        
        while matchLen == len(pattern) or matchLen >= 0 and pattern[matchLen] != c:
            startPos += shifts[matchLen]
            matchLen -= shifts[matchLen]
        matchLen += 1
        if matchLen == len(pattern):
            yield startPos

t = [1, 2, 2, 3, 3, 2, 4, 5, 2, 2, 3, 2]
p = [2, 2, 3]
[k for k in KnuthMorrisPratt(t, p)]

[1, 8]
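
A list comprehension is only one way to consume the generator. The sketch below shows a few alternatives using the sample data from the question. The function body is repeated (with only whitespace changes) so that the block runs on its own; the variable names `matches`, `first`, `rest`, `all_indices`, `from_iterator` and `no_match` are made up for this illustration.

```python
def KnuthMorrisPratt(text, pattern):
    """Yield every starting index of pattern in text (same logic as above)."""
    # allow indexing into pattern and protect against change during yield
    pattern = list(pattern)

    # build table of shift amounts
    shifts = [1] * (len(pattern) + 1)
    shift = 1
    for pos in range(len(pattern)):
        while shift <= pos and pattern[pos] != pattern[pos - shift]:
            shift += shifts[pos - shift]
        shifts[pos + 1] = shift

    # do the actual search
    startPos = 0
    matchLen = 0
    for c in text:
        while matchLen == len(pattern) or \
              matchLen >= 0 and pattern[matchLen] != c:
            startPos += shifts[matchLen]
            matchLen -= shifts[matchLen]
        matchLen += 1
        if matchLen == len(pattern):
            yield startPos


t = [1, 2, 2, 3, 3, 2, 4, 5, 2, 2, 3, 2]
p = [2, 2, 3]

# Calling the function only creates a generator object; no search has run yet.
matches = KnuthMorrisPratt(t, p)

# next() advances the search to the first match and returns its index.
first = next(matches)

# list() drains whatever the generator has not produced yet.
rest = list(matches)

# list() on a fresh generator collects every index in one step.
all_indices = list(KnuthMorrisPratt(t, p))

# The text may also be a plain iterator, as the docstring states.
from_iterator = list(KnuthMorrisPratt(iter(t), p))

# A default value avoids StopIteration when there is no match at all.
no_match = next(KnuthMorrisPratt(t, [9, 9]), None)

print(first, rest, all_indices, from_iterator, no_match)
```

Using `next()` is handy when only the first occurrence matters, because the search stops reading the text as soon as that match is found.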
