
Python: use string variable as search pattern in regex

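I'm trying to use a regular expression to search a nucleotide sequence (composed only of A, C, G, T) for a user-defined pattern.

The relevant code is as follows: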

    match = re.match(r'{0}'.format(pattern), sequence)

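match always comes back as None, but I need it to return the part of the sequence that matches the user's query...

What am I doing wrong?

EDIT: This is how I construct the search pattern: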

    askMotif = raw_input("Enter a motif to search for it in the sequence (The wildcard character '?' represents any nucleotide in that position, and * represents none or many nucleotides in that position.): ")
    listMotif = []
    letterlist = ['A','C','G','T', 'a', 'c','g','t']
    for letter in askMotif:
        if letter in letterlist:
            a = letter.capitalize()
            listMotif.append(a)
        if letter == '?':
            listMotif.append('.')
        if letter == '*':
            listMotif.append('*?')
    pattern = ''
    for searcher in listMotif:
        pattern+=searcher

Not very pythonic, I know...

That should work fine:

    >>> tgt='AGAGAGAGACGTACACAC'
    >>> re.match(r'{}'.format('ACGT'), tgt)
    >>> re.search(r'{}'.format('ACGT'), tgt)
    <_sre.SRE_Match object at 0x10a5d6920>

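I think the problem is whether you mean to use search or match: re.match returns None here because 'ACGT' does not occur at the very start of the sequence, while re.search finds it further in.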
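To make the difference concrete, here is a minimal sketch using the same target sequence. It is written for Python 3 (the snippets in this thread are Python 2), and the variable names `anchored` and `anywhere` are only illustrative. It also shows that `r'{0}'.format(pattern)` produces the same string as `pattern`, so the variable can be passed to `re` directly.

```python
import re

sequence = 'AGAGAGAGACGTACACAC'
pattern = 'ACGT'  # a plain string variable is already a valid pattern

# The format() wrapper adds nothing: it yields the same string.
same = r'{0}'.format(pattern) == pattern

anchored = re.match(pattern, sequence)   # only tries to match at index 0
anywhere = re.search(pattern, sequence)  # scans the whole sequence

print(same)               # True
print(anchored)           # None, the sequence starts with 'AGAG'
print(anywhere.group(0))  # ACGT
print(anywhere.start())   # 8
```

If the match really should be tied to the start of the sequence, `re.match` is the right call; otherwise `re.search` is usually what is wanted.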


A suggestion for the code you posted:

    # Python 2 (raw_input and the print statement)
    prompt = '''\
        Enter a motif to search for it in the sequence
        (The wildcard character '?' represents any nucleotide in that position,
         and * represents none or many nucleotides in that position.)
    '''
    letterlist = ['A', 'C', 'G', 'T', '?', '*']  # allowed characters
    pattern = None
    while pattern is None:
        print prompt
        user_input = raw_input('>>> ').upper()
        # Accept only input longer than one character made of allowed symbols
        if len(user_input) > 1 and all(c in letterlist for c in user_input):
            # '?' -> any single character, '*' -> zero or more characters (non-greedy)
            pattern = user_input.replace('?', '.').replace('*', '.*?')
        else:
            print 'Bad pattern, please try again'
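The same translation can be wrapped in a function so it is easy to test without typing at a prompt. This is only a sketch for Python 3: `motif_to_regex` and `find_motif` are hypothetical names, not part of the posted code, and the wildcard mapping simply follows the `.` and `.*?` replacements above.

```python
import re

ALLOWED = set('ACGT?*')  # nucleotides plus the two wildcard characters

def motif_to_regex(motif):
    """Translate a motif with ? and * wildcards into a regex string."""
    motif = motif.upper()
    if not motif or not set(motif) <= ALLOWED:
        raise ValueError('motif may contain only A, C, G, T, ? and *')
    # '?' -> any single character, '*' -> zero or more characters (non-greedy)
    return motif.replace('?', '.').replace('*', '.*?')

def find_motif(motif, sequence):
    """Return the first part of sequence that matches motif, or None."""
    found = re.search(motif_to_regex(motif), sequence.upper())
    return found.group(0) if found else None

print(motif_to_regex('ac?t'))                    # AC.T
print(find_motif('ac?t', 'AGAGAGAGACGTACACAC'))  # ACGT
print(find_motif('G*C', 'AGAGAGAGACGTACACAC'))   # GAGAGAGAC
print(find_motif('TTT', 'AGAGAGAGACGTACACAC'))   # None
```

Raising `ValueError` on bad input is a design choice made for this sketch; the interactive loop above re-prompts instead.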

re.match() only matches at the beginning of the sequence. Perhaps you need re.search() instead:

    >>> re.match(r'{0}'.format('bar'), 'foobar').group(0)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'NoneType' object has no attribute 'group'
    >>> re.search(r'{0}'.format('bar'), 'foobar').group(0)
    'bar'
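Because both functions return None when nothing matches, calling `.group(0)` directly raises the AttributeError shown above. A small sketch for Python 3 of checking the result first, and of collecting every match with `re.finditer`; the helper name `all_matches` is made up for this example.

```python
import re

def all_matches(pattern, sequence):
    """Return (start, matched_text) for every non-overlapping match."""
    return [(m.start(), m.group(0)) for m in re.finditer(pattern, sequence)]

sequence = 'AGAGAGAGACGTACACAC'

first = re.search('AC.T', sequence)
if first is None:  # nothing matched, so there is no group to read
    print('no match')
else:
    print(first.group(0))  # ACGT

print(all_matches('AC', sequence))   # [(8, 'AC'), (12, 'AC'), (14, 'AC'), (16, 'AC')]
print(all_matches('TTT', sequence))  # []
```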


Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish, please cite this site's URL or the original address. Questions: yoyou2525@163.com.