简体   繁体   中英

How to create a naive exact matching code that allows at least 2 mismatch?

For example, ACTTTA occurs twice in ACTTACTTGATAAAGT, once at offset 0 with 2 mismatches, and once at offset 4 with 1 mismatch. So naive_2mm('ACTTTA', 'ACTTACTTGATAAAGT') should return the list |[0, 4].

I'm still a newbie. I've been working on this problem for almost a week now and I can't figure this out on my own. This is the code that I develop. Can someone explain why this code does not work and how can I do this?

def naive_2mm(p,t):
occurences = []
counter = 0
for i in range(len(t)-len(p)+1):
    while counter != (len(p)-2):
      for j in range(len(p)):
        if t[i+j] == p[j]:
          counter += 1
          continue
else:
   occurences.append(i)
return occurences

Your code is pretty close. I think the counter initialization is just about the only issue. This works:

def naive_2mm(p,t):
    occurrences = []
    for i in range(len(t)-len(p)+1):
        counter = 0
        for j,k in zip(p, t[i:i+len(p)]):
            counter += j==k
        if counter >= len(p)-2:
            occurrences.append(i)
    return occurrences

needle = 'ACTTTA' 
haystack = 'ACTTACTTGATAAAGT'

print(naive_2mm(needle,haystack))

Or, for those of you who like one-liners, you can replace the 3-line sequence creating counter with:

        counter = sum(j==k for j,k in zip(p, t[i:i+len(p)]))

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM