
Extract a substring using a Python regular expression

I am trying to extract a substring like ***.ini from a string. For example, I have:

000012: 378:210 File=test1.ini  Cmd:send command1 
000512: 3378:990 File=test2.ini  Cmd:send command2 File=not.ini Cmd: include command

I need to extract the file name after the first "File=", and everything that follows that first File=***.ini, starting at "Cmd:" and running to the end of the line.

So the result I want is:

test1.ini
Cmd:send command1 

and

test2.ini  
Cmd:send command2 File=not.ini Cmd: include command

I tried:

re.match("(.*) File=(.*).ini(.*)Cmd:(.*)", line, re.M|re.I)

This works well for the first line, but for the second line I get:

test2.ini  Cmd:send command2 File=not.ini  #which is wrong, wanted is: 

test.ini

Cmd: include command

Can anyone please help? Thanks. LJ

You can use this regex with the re.findall function:

\bFile=(.+?\.ini)\s+(Cmd:.*)


Code:

import re
p = re.compile(r'\bFile=(.+?\.ini)\s+(Cmd:.*)')  # lazy .+? stops at the first ".ini"
print(p.findall(input_str))  # input_str holds the text to search
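To make the snippet above easier to try, here is a minimal self-contained sketch that runs the same pattern over the two sample lines from the question. The variable names input_str, pattern and results are just illustrative, and the trailing space on the first sample line is left out.

```python
import re

# The two sample log lines from the question, joined into one string.
input_str = (
    "000012: 378:210 File=test1.ini  Cmd:send command1\n"
    "000512: 3378:990 File=test2.ini  Cmd:send command2 File=not.ini Cmd: include command\n"
)

# .+? is lazy, so the first group ends at the first ".ini" after "File=".
# "." does not match a newline by default, so (Cmd:.*) stops at the end of the line.
pattern = re.compile(r'\bFile=(.+?\.ini)\s+(Cmd:.*)')

# findall returns one (filename, command) tuple per match.
results = pattern.findall(input_str)
for filename, command in results:
    print(filename)
    print(command)
```

Because the second group consumes the rest of the line, the later "File=not.ini" on the second line ends up inside the command text and is not reported as a separate match.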

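.* is too greedy: it consumes as much as it can, so the leading (.*) runs up to the last " File=" on the line rather than the first. There is also no need to match from the start of the line. Try this:

re.search(r"File=([^.]+\.ini).*?(Cmd:.*)", line).groups()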
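As a rough illustration of the greediness point, the sketch below runs the pattern from the question and a re.search version side by side on the second sample line. The helper name extract is hypothetical (it is not part of the re module), the dot before "ini" is escaped here, and the None check is an addition for lines that do not match.

```python
import re

line = "000512: 3378:990 File=test2.ini  Cmd:send command2 File=not.ini Cmd: include command"

# Pattern from the question: the leading greedy (.*) backtracks only as far
# as the LAST " File=", so group 2 captures the second file name.
greedy = re.match(r"(.*) File=(.*).ini(.*)Cmd:(.*)", line)
greedy_name = greedy.group(2)


def extract(text):
    """Hypothetical helper: return (filename, command) for the first
    File=... entry in text, or None when the pattern does not match."""
    # re.search scans for the first "File="; [^.]+ cannot cross a dot,
    # and .*? skips as little as possible before "Cmd:".
    m = re.search(r"File=([^.]+\.ini).*?(Cmd:.*)", text)
    return m.groups() if m else None


print(greedy_name)
print(extract(line))
print(extract("no match here"))
```

One limitation of [^.]+ is that a file name containing extra dots, such as my.file.ini, would not match. The lazy .+?\.ini form from the other answer handles that case.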
