
Match numbers if line starts with keyword

I've got a file that looks like this:

foo: 11.00 12.00  bar 13.00
bar: 11.00 12.00 bar
foo: 11.00 12.00

and would like to extract all numbers in lines beginning with the keyword "foo:". Expected result:

['11.00', '12.00', '13.00']
['11.00', '12.00']

This is easy if I use two regexes, like this:

    if re.match(r'^foo:', line):
        numbers = re.findall(r'\d+\.\d+', line)

but I was wondering whether it is possible to combine these into a single regex.

Thanks for your help, MD

Not exactly what you asked for, but since it's recommended to use standard Python tools instead of regexes where possible, I'd do something like this:

import re

with open('numbers.txt', 'r') as f:
    # keep only lines that begin with the keyword, then collect their numbers
    results = [re.findall(r'\d+\.\d+', line) for line in f if line.startswith('foo:')]

UPDATE

And this will return the numbers after 'foo' even if it's anywhere in the string rather than just in the beginning:

with open('numbers.txt', 'r') as f:
    # partition('foo')[2] is the text after the first 'foo' (empty if there is none)
    results = [re.findall(r'\d+\.\d+', line.partition('foo')[2]) for line in f]
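One thing worth knowing about the partition approach: when the keyword is missing, `str.partition` returns the whole line followed by two empty strings, so such lines produce an empty list instead of being skipped. A minimal sketch of that behaviour (the helper name `numbers_after` and the third sample line are made up for illustration):

```python
import re

lines = [
    "foo: 11.00 12.00  bar 13.00",
    "bar: 11.00 12.00 bar",
    "9.00 foo: 11.00 12.00",  # hypothetical line with the keyword in the middle
]

def numbers_after(keyword, line):
    # str.partition returns (before, separator, after); if the keyword
    # is absent, "after" is the empty string and findall returns [].
    after = line.partition(keyword)[2]
    return re.findall(r"\d+\.\d+", after)

results = [numbers_after("foo", line) for line in lines]

# Drop the empty lists if lines without the keyword should be ignored.
non_empty = [row for row in results if row]
```

Here the second line contributes `[]` to `results`, and the `9.00` before `foo` in the third line is not picked up.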

If every matching line in the file always contains the same number of values (three in this example), you can use the following regex:

r"^foo:\D*(\d+\.\d+)\D*(\d+\.\d+)\D*(\d+\.\d+)"

Example:

>>> import re
>>> line = "foo: 11.00 12.00 bar 13.00"
>>> re.match(r"^foo:\D*(\d+\.\d+)\D*(\d+\.\d+)\D*(\d+\.\d+)", line).groups()
('11.00', '12.00', '13.00')

Using parentheses around a part of the regular expression makes it into a group that can be extracted from the match object. See the Python documentation for more information.
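Note that `re.match` returns `None` when a line does not fit the pattern, so calling `.groups()` directly raises an `AttributeError` on lines that lack the keyword or have fewer than three numbers. A small sketch of guarding against that, assuming exactly three numbers per matching line (the function name `extract_three` is mine, and the pattern is a slightly stricter variant that uses `\d+` and `\D*`):

```python
import re

# Assumption: a matching line has exactly three decimal numbers.
PATTERN = re.compile(r"^foo:\D*(\d+\.\d+)\D*(\d+\.\d+)\D*(\d+\.\d+)")

def extract_three(line):
    """Return a tuple of three number strings, or None if the line does not match."""
    m = PATTERN.match(line)
    return m.groups() if m else None

three = extract_three("foo: 11.00 12.00  bar 13.00")
too_few = extract_three("foo: 11.00 12.00")       # only two numbers
wrong_key = extract_three("bar: 11.00 12.00 bar")  # different keyword
```

With the sample from the question, only the first line yields a tuple. The two-number `foo:` line returns `None`, which is why this approach only suits files with a fixed number of values per line.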

You can do without the first regexp and instead filter lines in a list comprehension by comparing the first four characters of the line, and compile the inner regexp:

import re

with open("input.txt", "r") as inp:
    prog = re.compile(r"\d+\.\d+")  # compile once, reuse for every line
    results = [prog.findall(line) for line in inp if line[:4] == "foo:"]
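To try this without creating a file, here is a rough self-contained version that runs against the sample from the question. It uses `io.StringIO` as a stand-in for the open file and `str.startswith` instead of the slice, so the keyword length is not hard-coded; the function name `numbers_in_keyword_lines` is just illustrative.

```python
import io
import re

# Stand-in for the real input file; contents are the sample from the question.
sample = io.StringIO(
    "foo: 11.00 12.00  bar 13.00\n"
    "bar: 11.00 12.00 bar\n"
    "foo: 11.00 12.00\n"
)

number = re.compile(r"\d+\.\d+")  # compiled once, reused for every line

def numbers_in_keyword_lines(lines, keyword="foo:"):
    """Return one list of decimal strings per line that starts with keyword."""
    return [number.findall(line) for line in lines if line.startswith(keyword)]

results = numbers_in_keyword_lines(sample)
for row in results:
    print(row)
```

This prints the two lists given as the expected result in the question, and the `bar:` line is skipped.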
