I need a flexible way to search for a pattern in a string.
Say our pattern is 'GEEGG'.
I want to determine whether a string contains this pattern, allowing 'interrupting' and 'flanking' symbols.
• Interrupting symbol for 'GEEGG' = 'GEGEGG' or 'GEEGEG'
• Flanking symbol for 'GEEGG' = 'GGEEGG' or 'GEEGGE'
I cannot think of a simple/elegant way to approach this problem.
All of the following queries should match the pattern:
pattern = 'GEEGG'
query_flank = '--GEEGG--'
query_flank2 = '--GE--GEEGG--'
query_interrupt = '--G-E-E-G-G-'
query_interrupt2 = '--G-E-G-E-E-G-G'
With Python's re (regex) module, you could try '*' (asterisk), which repeats the preceding character zero or more times, or '.*' (period asterisk), which matches any run of characters in between:
import re
txt = "<to search>"
# Note: a pattern that starts with '*' raises re.error ("nothing to repeat")
x = re.search("*G*E*E*G*G*", txt)
*** (Updated answer below, after rici's comment)
import re
pattern = 'GEEGG'
query_flank = '--GEEGG--'
query_flank2 = '--GE--GEEGG--'
query_interrupt = '--G-E-E-G-G-'
query_interrupt2 = '--G-E-G-E-E-G-G'
txt = "--GEEGG--"
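# '*' repeats the preceding character zero or more times, so this finds the
# contiguous 'GEEGG' here but cannot skip interrupting symbols such as '-';
# it also matches a lone 'G', since every other term may be empty.
x = re.search("G*E*E*G*G", txt)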
print("print x")
print(x)
import re
pattern = 'GEEGG'
query_flank = '--GEEGG--'
query_flank2 = '--GE--GEEGG--'
query_interrupt = '--G-E-E-G-G-'
query_interrupt2 = '--G-E-G-E-E-G-G'
txt = "--GEEGG--"
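# '.*' greedily matches any run of characters between the symbols.
# The trailing 'G*' makes the last G optional, so the final '.*' also
# swallows the trailing '--' in the match.
y = re.search("G.*E.*E.*G.*G*", txt)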
print("print y")
print(y)
OUTPUT:
print x
<re.Match object; span=(2, 7), match='GEEGG'>
print y
<re.Match object; span=(2, 9), match='GEEGG--'>
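As a possible refinement of the idea above, here is a minimal sketch that builds the regex from the pattern instead of typing it by hand, and checks all four queries from the question. It assumes that any symbol, any number of times, may interrupt or flank the pattern, and that the text has no newlines (by default '.' does not match them). The helper names build_gapped_regex and is_subsequence are hypothetical, not part of the re module. The second helper is a plain-Python alternative that asks the same question without a regex: do the pattern's symbols appear in the text in order?

```python
import re


def build_gapped_regex(pattern):
    # Hypothetical helper: escape each symbol, then allow any (possibly empty)
    # run of characters between consecutive symbols. Lazy '.*?' keeps the
    # reported match as short as possible from its starting point.
    return re.compile(".*?".join(re.escape(ch) for ch in pattern))


def is_subsequence(pattern, text):
    # Hypothetical helper: walk the text once; each 'in' test advances the
    # shared iterator, so symbols must be found in order.
    it = iter(text)
    return all(ch in it for ch in pattern)


pattern = "GEEGG"
queries = {
    "query_flank": "--GEEGG--",
    "query_flank2": "--GE--GEEGG--",
    "query_interrupt": "--G-E-E-G-G-",
    "query_interrupt2": "--G-E-G-E-E-G-G",
}

gapped = build_gapped_regex(pattern)
for name, query in queries.items():
    print(name, bool(gapped.search(query)), is_subsequence(pattern, query))
```

Unlike the trailing 'G*' version, every symbol of the pattern is required here, so a string such as '--GEEG--' (one G short) is not reported as a match. If interruptions should be limited, for example to at most one symbol between neighbours, the '.*?' joiner could be swapped for something like '.?', which is a different assumption about what counts as a match.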