python regex for extracting data without

The following is my input file:

input.txt

min=1310ns median=1344ns max=1399ns first=2280ns
min=1293ns median=1331ns max=18400ns first=2284ns
min=1277ns median=1302ns max=1346ns first=2363ns

My Python code:

import re

input_file = open("input.txt", "r")
output_file = open("output.data", "w")

for line in input_file:
    match_defines = re.match(r'\s*min=([0-9]+)', line) # works fine
    match_defines = re.match(r'\s*min=([0-9]+) median=([0-9]+) max=([0-9]+) first=([0-9]+)', line) # this doesn't work. 

    if match_defines:
        newline1= "\n %s\t%s\t%s\t%s\n" % (match_defines.group(1), match_defines.group(2), match_defines.group(3), match_defines.group(4))
        output_file.write(newline1)

    else:
        output_file.write(line)

My expected result is

1310   1344   1399   2280
1293   1331   18400  2284
1277   1302   1346   2363

How do I modify my regex to get this?

Thanks for your answers.

You forgot to add ns in the regex:

\s*min=([0-9]+)ns median=([0-9]+)ns max=([0-9]+)ns first=([0-9]+)
               ^^                ^^             ^^

See regex demo
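As a minimal sketch of that fix, the corrected pattern can be run against the three sample lines from the question. The names PATTERN, sample_lines and convert are made up for this illustration, and the lines are held in a list instead of being read from input.txt so the snippet runs on its own:

```python
import re

# Corrected pattern: every value except the last is followed by a literal "ns".
# re.match only anchors at the start, so the trailing "ns" after first= is optional here.
PATTERN = re.compile(
    r'\s*min=([0-9]+)ns median=([0-9]+)ns max=([0-9]+)ns first=([0-9]+)'
)

# Sample lines copied from input.txt in the question.
sample_lines = [
    "min=1310ns median=1344ns max=1399ns first=2280ns",
    "min=1293ns median=1331ns max=18400ns first=2284ns",
    "min=1277ns median=1302ns max=1346ns first=2363ns",
]


def convert(line):
    """Return the four numbers tab-separated, or the line unchanged if it does not match."""
    m = PATTERN.match(line)
    if m:
        return "\t".join(m.groups())
    return line


converted = [convert(line) for line in sample_lines]
for row in converted:
    print(row)
```

In the original script, the same pattern would simply replace the second re.match call; the rest of the loop can stay as it is.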

I suggest using named capture groups to make the captures easier to access, and perhaps \s+ instead of literal spaces:

\s*min=(?P<min>[0-9]+)ns\s+median=(?P<median>[0-9]+)ns\s+max=(?P<max>[0-9]+)ns\s+first=(?P<first>[0-9]+)

See another demo
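A short sketch of how the named groups might be used from Python. The variable names and the test line with irregular whitespace are assumptions made for this example, not part of the question's data:

```python
import re

# Same pattern as above, split across two string literals for readability.
NAMED = re.compile(
    r'\s*min=(?P<min>[0-9]+)ns\s+median=(?P<median>[0-9]+)ns'
    r'\s+max=(?P<max>[0-9]+)ns\s+first=(?P<first>[0-9]+)'
)

# Hypothetical line with uneven spacing, to show what \s+ tolerates.
line = "min=1293ns   median=1331ns\tmax=18400ns first=2284ns"

m = NAMED.match(line)
fields = m.groupdict() if m else {}           # all captures, keyed by group name
max_ns = int(m.group("max")) if m else None   # a single capture, looked up by name

print(fields)
print(max_ns)
```

Looking values up by name means the code keeps working if the pattern later gains or reorders groups, which is the main advantage over group(1) to group(4).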
