How to fix this regular expression in Python?

Question

I want to process some string data that prints out like this:

'node0, node1 0.04, node8 11.11, node14 72.21\n'
'node1, node46 1247.25, node6 20.59, node13 64.94\n'

I want to find all the floating-point numbers here. This is the code I use:

for node in nodes:
    pattern = re.compile('(?<!node)\d+.\d+')
    distance = pattern.findall(node)

However, the result looks like this:

['0.04', '11.11', '4 72']

while what I want is this:

['0.04', '11.11', '72.21']

Any suggestions on how to fix this regular expression?

Answer 1

The . in your expression is unescaped, so it matches any character instead of a literal period. Escape it:

for node in nodes:
    pattern = re.compile(r"(?<!node)\d+\.\d+")
    distance = pattern.findall(node)

Answer 2

In regular expressions, an unescaped . is a wildcard that matches (almost) any character; by default, everything except a newline. Thus your search pattern actually matches one or more digits, followed by any character, followed by one or more digits, which is how '4 72' slipped through. To make the dot match only a literal period, escape it with a backslash: \.
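To see the difference side by side, here is a minimal sketch that runs both patterns against the first sample line from the question. The variable names (line, unescaped, escaped) are just for illustration and are not from the original code.

```python
import re

# First sample line from the question.
line = 'node0, node1 0.04, node8 11.11, node14 72.21\n'

# Unescaped dot: "." matches any character except a newline,
# so the space between "4" and "72" is accepted.
unescaped = re.findall(r'(?<!node)\d+.\d+', line)

# Escaped dot: "\." matches only a literal period.
escaped = re.findall(r'(?<!node)\d+\.\d+', line)

print(unescaped)  # ['0.04', '11.11', '4 72']
print(escaped)    # ['0.04', '11.11', '72.21']
```

Note that the lookbehind only blocks a digit sitting directly after "node"; the "4" in "node14" is preceded by "1", so it is a legal starting point, and only the escaped dot keeps it from matching.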

(An aside: You don't need to compile your regex pattern inside your loop. In fact, that will slow your code down.)

import re

# Compile once, outside the loop; the escaped \. matches only a literal period.
pattern = re.compile(r'(?<!node)\d+\.\d+')
for node in nodes:
    distance = pattern.findall(node)
    print(distance)

output:

['0.04', '11.11', '72.21']
['1247.25', '20.59', '64.94']

solution1
4 2015-05-29 19:13:07

solution2
4 ACCEPTED 2015-05-29 19:19:12
