
Recursive combination string searching in Python

I am trying to write an algorithm that takes a string a and a longer string b as arguments and returns every way of picking out the letters of a, in order, from b. Each result is an increasing list of indices into b. (I admit this is a poor definition of the problem; I'm not quite sure how to word it. Hopefully the example below will clarify what I mean.)

Here are some assumptions about the input arguments.

  1. All letters in a and b are capitalized.
  2. len(a) < len(b)
  3. Only letters that exist in a will appear in b, i.e. set(a) == set(b).
  4. Duplicate letters are allowed in both a and b .

Example:

If a = "SLSQ" and b = "SQLSSQLSQ", then the result would look like:

result = [
[0,2,3,5],
[0,2,3,8],
[0,2,4,5],
[0,2,4,8],
[0,2,7,8],
[0,6,7,8],
[3,6,7,8],
[4,6,7,8]]
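To pin down what counts as a valid entry in that result, here is a minimal sketch of a checker. The name is_match is my own (it is not from the question); it assumes indices is a list or tuple of integers.

```python
def is_match(a, b, indices):
    """Return True if `indices` picks out the letters of a from b, left to right."""
    # Indices must be strictly increasing so the letters keep their order in b.
    increasing = all(i < j for i, j in zip(indices, indices[1:]))
    # Every index must be a valid position in b.
    in_range = all(0 <= i < len(b) for i in indices)
    return (
        len(indices) == len(a)
        and increasing
        and in_range
        and all(b[i] == ch for i, ch in zip(indices, a))
    )


print(is_match("SLSQ", "SQLSSQLSQ", [0, 2, 3, 5]))  # a valid entry from the list above
print(is_match("SLSQ", "SQLSSQLSQ", [0, 1, 3, 5]))  # b[1] is "Q", not "L"
```

The result the question asks for is then every index list of length len(a) for which this check passes.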

Another way of looking at it: I wrote out explicitly what the results of a recursive algorithm would look like for the example above. The numbers are the indices of the letters in b.

012345678
SQLSSQLSQ       SLSQ
S LS Q      ->  0235
S LS    Q   ->  0238
S L SQ      ->  0245
S L S   Q   ->  0248
S L    SQ   ->  0278
S     LSQ   ->  0678
   S  LSQ   ->  3678
    S LSQ   ->  4678
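The rows of that diagram can be reproduced from an index list with a few lines of Python. This is only an illustrative sketch; render_row is a hypothetical helper name, not something from the question.

```python
def render_row(b, indices):
    """Show only the letters of b at the chosen indices, blanks elsewhere."""
    chosen = set(indices)
    return "".join(ch if i in chosen else " " for i, ch in enumerate(b))


b = "SQLSSQLSQ"
print("".join(str(i) for i in range(len(b))))  # index ruler (single digits only)
print(b)
for idx in [(0, 2, 3, 5), (0, 6, 7, 8), (4, 6, 7, 8)]:
    print(render_row(b, idx), "->", "".join(map(str, idx)))
```

Each row keeps the selected letters in their original columns, which makes it easy to see that the chosen positions only ever move to the right.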

I am fairly certain I could write a brute-force algorithm to solve this problem, but what I really want is a clean, tractable, Pythonic recursive algorithm. Unfortunately, my recursion coding skills aren't that impressive. This is what I have so far:

def recurse(a_str, b_str, res):

    if len(a_str) == 0:
        return _, _, res
    for token in b_str:
        if token == a_str[0]:
            _ = a_str[0]
            _, _, res = recurse(a_str[1:], b_str, res)
        else:
            _, _, res = recurse(a_str, b_str[1:], res)
    return _, _, res

The "_" are just placeholders until I can figure out what to do next. My brain hurts. Any suggestions would be appreciated greatly.

Here is a recursive version that tracks the current positions in a and b as ai and bi:

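def recurse(a_str, b_str, ai=0, bi=0):
    # Yield tuples of indices into b_str that spell out a_str in order.
    if not a_str:
        return  # empty pattern: yield nothing
    if ai < len(a_str):
        # Stop early enough to leave room in b_str for the rest of a_str.
        b_lim = len(b_str) - len(a_str) + ai + 1
        for i in range(bi, b_lim):
            if a_str[ai] == b_str[i]:
                # Match a_str[ai] at position i, then match the rest after i.
                for r in recurse(a_str, b_str, ai+1, i+1):
                    yield (i,) + r
    else:
        yield ()  # all of a_str matched: one empty tail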

list(recurse(a, b))
[(0, 2, 3, 5),
 (0, 2, 3, 8),
 (0, 2, 4, 5),
 (0, 2, 4, 8),
 (0, 2, 7, 8),
 (0, 6, 7, 8),
 (3, 6, 7, 8),
 (4, 6, 7, 8)]

combinations from itertools will do this easily, so you don't need to write a recursive function by hand. Note that it tests every len(a)-sized selection of positions in b, so it is a brute-force approach.

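from itertools import combinations

a = "SLSQ"
b = "SQLSSQLSQ"

# Pair each letter of b with its index: ('S', 0), ('Q', 1), ...
B = list(zip(b, range(len(b))))
res = []
# combinations() keeps input order, so the indices in each candidate are increasing.
for combo in combinations(B, len(a)):
    bstr = "".join(letter for letter, _ in combo)
    if bstr == a:
        res.append([index for _, index in combo])

for r in res:
    print(r)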
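To get a feel for how much work the combinations approach does, this sketch counts the candidates it has to examine for the example input. It is my own illustration: brute_force is a hypothetical name, it works on indices directly instead of (letter, index) pairs, and math.comb needs Python 3.8+.

```python
from itertools import combinations
from math import comb  # available from Python 3.8


def brute_force(a, b):
    """Test every increasing len(a)-tuple of indices into b against a."""
    return [
        idx
        for idx in combinations(range(len(b)), len(a))
        if all(b[i] == ch for i, ch in zip(idx, a))
    ]


a, b = "SLSQ", "SQLSSQLSQ"
hits = brute_force(a, b)
print(comb(len(b), len(a)), "candidates examined,", len(hits), "matches")
# prints: 126 candidates examined, 8 matches
```

The candidate count grows as C(len(b), len(a)), which is fine for short strings like these but gets large quickly; the recursive version only follows prefixes that already match.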
