What is wrong with this regex match in python?

Question

I am having trouble getting this regex to match in Python. Can someone see what is wrong?

Sample strings I am trying to match with a single regular expression are:

string = '[Pre-Avatar Mode Cost: 5.50 MP]'
string = '[Pre-Avatar Mode Cost: 1.2 MP]'
string = '[Pre-Avatar Mode Cost: 0.5 MP]'
string = '[Post-Avatar Mode: 0 MP]'

I have tried the following, but there doesn't seem to be a single expression that matches all of them:

import re
m = re.match(r'\[.*(?P<cost>\d+(\.\d+)).*\]', string) # Appears to match only the ones with #.#
m = re.match(r'\[.*(?P<cost>\d+(\.\d+)?).*\]', string) # Appears to match the 0 only, unable to print out m.groups for the others

I am trying to capture the numeric values (5.50, 1.2, 0.5, 0, etc.).

Answer 1

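You need to make the first .* non-greedy (add a ?). Otherwise the greedy .* consumes as many characters as it can, including most of the number, and leaves the cost group with only the final digit: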

r'\[.*?(?P<cost>\d+(?:\.\d+)?).*\]'

I've also made the optional fractional part (.number) a non-capturing group, so that groups() returns only the cost:

>>> import re
>>> costre = re.compile(r'\[.*?(?P<cost>\d+(?:\.\d+)?).*\]')
>>> costre.match('[Post-Avatar Mode: 0 MP]').groups()
('0',)
>>> costre.match('[Post-Avatar Mode: 5.50 MP]').groups()
('5.50',)
>>> costre.match('[Post-Avatar Mode: 1.2 MP]').groups()
('1.2',)
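
To make the difference concrete, here is a small sketch that runs both the greedy pattern (the second attempt from the question, written as a raw string) and the non-greedy pattern above against the four sample strings from the question. The variable names (samples, greedy, lazy) are only illustrative.

```python
import re

# The four sample strings from the question.
samples = [
    '[Pre-Avatar Mode Cost: 5.50 MP]',
    '[Pre-Avatar Mode Cost: 1.2 MP]',
    '[Pre-Avatar Mode Cost: 0.5 MP]',
    '[Post-Avatar Mode: 0 MP]',
]

# Greedy prefix: .* grabs everything, then backtracks only as far as needed,
# so the cost group ends up with just the last digit of the number.
greedy = re.compile(r'\[.*(?P<cost>\d+(\.\d+)?).*\]')

# Non-greedy prefix: .*? stops at the first digit, so the cost group
# sees the whole number. The fractional part is non-capturing.
lazy = re.compile(r'\[.*?(?P<cost>\d+(?:\.\d+)?).*\]')

greedy_costs = [greedy.match(s).group('cost') for s in samples]
lazy_costs = [lazy.match(s).group('cost') for s in samples]

print(greedy_costs)  # ['0', '2', '5', '0']
print(lazy_costs)    # ['5.50', '1.2', '0.5', '0']
```

Note that the greedy pattern does match every sample; it just captures the wrong part of the number, which is easy to mistake for a failed match.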

Answer 2

I'd suggest using the : as the anchor. That way, you get a more robust expression:

r'\[.*: (?P<cost>\d+(?:\.\d+)?).*\]'

You might even want to add on the MP suffix if it's guaranteed to be in the text.
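
As a rough sketch of that idea, the pattern below anchors on the ": " before the number and also requires the " MP" suffix. It assumes the suffix is always present and always spelled "MP"; the helper name parse_cost and the last two test strings are hypothetical, added only for illustration.

```python
import re

# Anchor on ": " before the number and require " MP]" right after it.
# Assumption: every line ends with the literal suffix " MP]".
cost_re = re.compile(r'\[.*: (?P<cost>\d+(?:\.\d+)?) MP\]')

def parse_cost(text):
    """Return the cost as a float, or None if the text does not match."""
    m = cost_re.match(text)
    return float(m.group('cost')) if m else None

print(parse_cost('[Pre-Avatar Mode Cost: 5.50 MP]'))  # 5.5
print(parse_cost('[Post-Avatar Mode: 0 MP]'))          # 0.0
# A digit earlier in the label does not confuse the anchored pattern.
print(parse_cost('[Mode 2 Cost: 1.2 MP]'))             # 1.2
# No number after the colon: no match.
print(parse_cost('[Pre-Avatar Mode Cost: free]'))      # None
```

The third call is where the anchor pays off: a pattern that only relies on a non-greedy .*? would stop at the first digit it sees and capture the 2 from the label instead of 1.2.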

solution1
2 ACCEPTED 2012-10-29 17:21:58

solution2
1 2012-10-29 17:24:35
