Python regex multiple matches with grouping

Question

Input String

<msgCode>1111</msgCode>asdasdad<errorId>2222</errorId>

What I want

(1111,2222)

If I use findall, this is what I get:

>>> import re
>>> print(re.findall("<(msgCode|errorId)>([0-9]+)</(msgCode|errorId)>", "<msgCode>1111</msgCode>asdasdad<errorId>2222</errorId>"))
[('msgCode', '1111', 'msgCode'), ('errorId', '2222', 'errorId')]

What I hope for is

[('1111','2222')]

Is there an easy way to do this with re alone, without post-processing the output?

Answer 1

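Consider using XPath instead. Note that lxml's HTML parser lowercases tag names, which is why the expression uses msgcode and errorid: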

>>> from lxml import html
>>> root = html.fromstring('<msgCode>1111</msgCode>asdasdad<errorId>2222</errorId>')
>>> root.xpath('//*[self::msgcode or self::errorid]/text()')
['1111', '2222']

Answer 2

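Use a non-capturing group, (?:msgCode|errorId), for the tag names. findall returns only the capturing groups, so with a single capturing group left you get just the digits back: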

>>> import re
>>> subject = "<msgCode>1111</msgCode>asdasdad<errorId>2222</errorId>"
>>> result = re.findall("<(?:msgCode|errorId)>([0-9]+)</(?:msgCode|errorId)>", subject)
>>> print(result)
['1111', '2222']
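
The non-capturing pattern above returns a flat list, not the single tuple shown in the question, and it would also accept a mismatched pair such as <msgCode>1</errorId>. As a rough sketch (not part of either answer), a backreference can force the closing tag to repeat the opening one, and finditer lets you keep only the digits. The names pattern and values are just illustrative:

```python
import re

subject = "<msgCode>1111</msgCode>asdasdad<errorId>2222</errorId>"

# \1 is a backreference: the closing tag must repeat whatever tag name
# group 1 captured, so a mismatched pair like <msgCode>1</errorId> is rejected.
pattern = re.compile(r"<(msgCode|errorId)>([0-9]+)</\1>")

# finditer yields match objects, so we can take group 2 (the digits)
# and ignore group 1 (the tag name).
values = tuple(m.group(2) for m in pattern.finditer(subject))

print([values])  # [('1111', '2222')]
```

The tuple is still built outside the regex. findall and finditer return one result per match, so collecting separate matches into one tuple always takes a small step like this.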

2 answers

solution1
2 ACCEPTED 2014-01-31 03:42:56

solution2
-1 2014-01-31 03:11:58
