How to use regular expressions in Python?

Question

Hopefully someone can help. I'm trying to use a regular expression to extract the text that follows a pattern in a string, but it's not working and I'm not sure why. The regex works fine in Linux...

>>> import re
>>> s = "GeneID:5408878;gbkey=CDS;product=carboxynorspermidinedecarboxylase;protein_id=YP_001405731.1"
>>> x = re.search(r'(?<=protein_id=)[^;]*', s)
>>> print(x)
<_sre.SRE_Match object at 0x000000000345B7E8>

Answer 1

Call .group() on the search result to get the matched text; group(0) is the whole match:

>>> print(x.group(0))
YP_001405731.1

As Martijn pointed out, re.search() returned a match object, so the regular expression is correct. If it had not matched, print(x) would have printed None.
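To make the None case explicit, here is a minimal sketch that wraps the same search in a small helper. The name extract_protein_id is made up for illustration and is not part of the re module; the pattern is the one from the question.

```python
import re

s = "GeneID:5408878;gbkey=CDS;product=carboxynorspermidinedecarboxylase;protein_id=YP_001405731.1"

def extract_protein_id(text):
    """Return the text after 'protein_id=' up to the next ';', or None if absent."""
    # Hypothetical helper name; the regex is the asker's own lookbehind pattern.
    match = re.search(r'(?<=protein_id=)[^;]*', text)
    # re.search returns None when nothing matches, so check before calling .group()
    return match.group(0) if match is not None else None

print(extract_protein_id(s))            # YP_001405731.1
print(extract_protein_id("gbkey=CDS"))  # None
```

Checking for None first avoids the AttributeError you would otherwise get from calling .group() on a failed search.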

Answer 2

You should probably think about rewriting your regex so that it finds all key=value pairs; then you don't have to muck around with specific groups and hard-coded lookbehinds:

import re
# s is the string from the question
kv = dict(re.findall(r'(\w+)=([^;]+)', s))
# {'gbkey': 'CDS', 'product': 'carboxynorspermidinedecarboxylase', 'protein_id': 'YP_001405731.1'}
print(kv['protein_id'])
# YP_001405731.1
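If some records might lack a field, a self-contained variant of the same idea could look like the sketch below. The function name parse_fields and the key locus_tag are made up for illustration; only re.findall and dict.get are real library calls.

```python
import re

def parse_fields(text):
    """Collect every key=value pair from a ';'-separated record into a dict."""
    # Hypothetical helper; same pattern as above, written as a raw string.
    return dict(re.findall(r'(\w+)=([^;]+)', text))

s = "GeneID:5408878;gbkey=CDS;product=carboxynorspermidinedecarboxylase;protein_id=YP_001405731.1"
fields = parse_fields(s)

print(fields.get('protein_id'))        # YP_001405731.1
# .get() with a default avoids a KeyError for a field that is not present
print(fields.get('locus_tag', 'n/a'))  # n/a
```

Note that GeneID:5408878 is not picked up, because that field uses a colon rather than an equals sign.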

2 answers

solution1
8 ACCEPTED 2013-07-07 12:15:05

solution2
4 2013-07-07 12:25:22
