
Python re.findall() with a list of substrings

I have a long string (e.g. AAAABBBBCCCC) and I eventually want to count all overlapping occurrences of each member of a list of substrings (e.g. ['AAA', 'AAB', 'ABB', 'BBB']).

I found a very helpful suggestion in a previous StackOverflow post, "string count with overlapping occurrences". However, when I use it, I can't seem to pass the substrings in a way that re.findall() recognizes. It's probably something stupid, but I just can't figure it out. It seems like the ? is doing something different than usual...

>>> import re
>>> string = 'AAAABBBBCCCC'
>>> len(re.findall('(?=AAA)', string))
2
>>> substring = 'AAA'
>>> len(re.findall('(?=substring)', string))
0
>>> substring = "'(?=AAA)'"
>>> len(re.findall(substring, string))
0
>>> # This works, but is not overlapping:
>>> substring = 'AAA'
>>> len(re.findall(substring, string))
1

I would appreciate any suggestions. Thanks!

If I understood you correctly, you want to store the pattern in a variable and pass that variable to findall?

>>> substring = '(?=AAA)'  # or "(?=AAA)"
>>> len(re.findall(substring, string))
2
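To see why the third attempt in the question returned 0, it may help to compare the two pattern strings directly. This is a small sketch; the names quoted and plain are made up for illustration. Wrapping the pattern in a second pair of quotes makes the inner quote characters part of the regular expression itself, so it no longer means the same thing.

```python
import re

string = 'AAAABBBBCCCC'

# The outer double quotes delimit the Python string; the inner single
# quotes become literal characters inside the pattern.
quoted = "'(?=AAA)'"
plain = '(?=AAA)'

print(len(quoted), len(plain))           # 9 7: the quoted version is two characters longer
print(len(re.findall(quoted, string)))   # 0: the text contains no quote characters
print(len(re.findall(plain, string)))    # 2: overlapping matches at positions 0 and 1
```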

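See if this helps you with the rest. In your 5th line, the pattern contains the literal text "substring", not the value of the variable substring; the variable has to be concatenated into the pattern.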

>>> import re
>>> string = 'AAAABBBBCCCC'
>>> len(re.findall('(?=AAA)', string))
2
>>> substring = 'AAA'
>>> # Build the lookahead pattern from the variable's value
>>> len(re.findall('(?=' + substring + ')', string))
2
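Extending this to the whole list from the question could look like the sketch below. The helper name count_overlapping is hypothetical, and the call to re.escape is an addition not present in the answers above: it is there on the assumption that the substrings should be matched literally even if they contain characters such as . or *.

```python
import re

def count_overlapping(text, substrings):
    """Return a dict mapping each substring to its overlapping match count."""
    counts = {}
    for sub in substrings:
        # re.escape (an assumption, not in the original answers) keeps any
        # regex metacharacters in the substring from being interpreted.
        pattern = '(?=' + re.escape(sub) + ')'
        # A lookahead consumes no characters, so findall tests every
        # position in the text and overlapping matches are all counted.
        counts[sub] = len(re.findall(pattern, text))
    return counts

string = 'AAAABBBBCCCC'
result = count_overlapping(string, ['AAA', 'AAB', 'ABB', 'BBB'])
print(result)  # {'AAA': 2, 'AAB': 1, 'ABB': 1, 'BBB': 2}
```

Each match returned by findall here is an empty string, so only the length of the list is meaningful.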

