
Python regex to find substring

Sample text

port0 interface GigabitEthernet0/4/0

port1 interface TenGigabitEthernet0/1/0

login delay 2

bfd-template single-hop BDI

ip ftp source-interface Loopback0

ip tftp source-interface Loopback0

interface Loopback0

interface Loopback100

interface Loopback999

description *** Loopback interface for management ***

interface TenGigabitEthernet0/0/0

mtu 9216

carrier-delay msec 0

interface TenGigabitEthernet0/1/0

mtu 9216

carrier-delay msec 0

interface GigabitEthernet0/4/0

mtu 9216

interface GigabitEthernet0/4/1

My regex is:

[T][e]((?:.|\n)*?[e][c]\s\d+)

I am verifying it at pythex.org.

It matches the following:

TenGigabitEthernet0/1/0

mtu 9216

carrier-delay msec 0 

That is what I want. But it also matches:

TenGigabitEthernet0/1/0

login delay 2

bfd-template single-hop BDI

ip ftp source-interface Loopback0

ip tftp source-interface Loopback0

interface Loopback0

interface Loopback100

interface Loopback999

description *** Loopback interface for management ***

interface TenGigabitEthernet0/0/0

mtu 9216

carrier-delay msec 0

That I don't want. I am looking for a multiline regex that matches only the TenGig / mtu / carrier-delay block(s) in my string, and nothing else.

What I have written is:

import re

# Read in binary mode and strip the '\r\r\n' sequences, as in the original attempt.
with open(file, "rb") as buffer_:
    sb = buffer_.read().replace(b'\r\r\n', b'')
inf = re.compile(br'[T][e]((?:.|\n)*?[e][c]\s\d+)')
intf = inf.findall(sb)
print(intf)

It works perfectly for files in which the TenGig, mtu and carrier-delay lines appear only as consecutive lines, but not when "TenGig" also appears elsewhere in the file. How can I handle that case?

I think this regex is what you want:

(tengig.*?(?:\n+)?\bmtu\b.*?(?:\n+)?\bcarrier-delay\b[^\n]+)


Regex Breakdown

( #Capturing group
 tengig #Match tengig literally (case-insensitively, because re.IGNORECASE is set)
 .*? #Lazy matching up to the next requirement (. does not cross a newline)
 (?:\n+)? #Match one or more \n (OPTIONAL)
 \bmtu\b #Match mtu literally, as a whole word
 .*? #Lazy matching up to the next requirement
 (?:\n+)? #Match one or more \n (OPTIONAL)
 \bcarrier-delay\b #Match carrier-delay literally, as a whole word
 [^\n]+ #Match anything up to the next newline
) #End capturing group
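
The breakdown above can also be written straight into the pattern with re.VERBOSE, which lets the comments live next to each piece. This is only a sketch: the names PATTERN and sample are made up for illustration, and sample is a shortened version of the text from the question.

```python
import re

# Same pattern as above, spread over several lines. With re.VERBOSE,
# whitespace and '#' comments inside the pattern are ignored.
PATTERN = re.compile(r"""
    (                       # capturing group
      tengig                # literal text, any case (re.IGNORECASE)
      .*?                   # rest of the interface line, lazily
      (?:\n+)?              # optional run of newlines
      \bmtu\b               # the word mtu
      .*?                   # rest of the mtu line, lazily
      (?:\n+)?              # optional run of newlines
      \bcarrier-delay\b     # the word carrier-delay
      [^\n]+                # everything up to the end of that line
    )
""", re.VERBOSE | re.IGNORECASE)

# Shortened sample (an assumption for this sketch, not the full config).
sample = (
    "port1 interface TenGigabitEthernet0/1/0\n\n"
    "login delay 2\n\n"
    "interface TenGigabitEthernet0/0/0\n\n"
    "mtu 9216\n\n"
    "carrier-delay msec 0\n\n"
    "interface GigabitEthernet0/4/0\n\n"
    "mtu 9216\n"
)

matches = PATTERN.findall(sample)
for m in matches:
    print(repr(m))
```

Only the TenGigabitEthernet0/0/0 block should come back here: the "port1 interface TenGigabitEthernet0/1/0" line is followed by "login delay 2" rather than an mtu line, so the pattern gives up on it.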

Python Code

import re
p = re.compile(r'(tengig.*?(?:\n+)?\bmtu\b.*?(?:\n+)?\bcarrier-delay\b[^\n]+)', re.MULTILINE | re.IGNORECASE)
test_str = "port0 interface GigabitEthernet0/4/0\n\nport1 interface TenGigabitEthernet0/1/0\n\nlogin delay 2\n\nbfd-template single-hop BDI\n\nip ftp source-interface Loopback0\n\nip tftp source-interface Loopback0\n\ninterface Loopback0\n\ninterface Loopback100\n\ninterface Loopback999\n\ndescription * Loopback interface for management *\n\ninterface TenGigabitEthernet0/0/0\n\nmtu 9216\n\ncarrier-delay msec 0\n\ninterface TenGigabitEthernet0/1/0\n\nmtu 9216\n\ncarrier-delay msec 0\n\ninterface GigabitEthernet0/4/0\n\nmtu 9216\n\ninterface GigabitEthernet0/4/1\n"
print(p.findall(test_str))  # list of the matched TenGig / mtu / carrier-delay blocks
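
As for why the original pattern over-matches: a lazy quantifier only keeps the end of a match as short as possible. The match still starts at the leftmost place it can, so it begins at the first "Te" in the text and runs forward to the first "ec <digits>". A small side-by-side sketch, where text, lazy and anchored are illustrative names and text is a trimmed-down sample:

```python
import re

# Trimmed-down sample (an assumption for this sketch).
text = (
    "port1 interface TenGigabitEthernet0/1/0\n"
    "login delay 2\n"
    "interface TenGigabitEthernet0/0/0\n"
    "mtu 9216\n"
    "carrier-delay msec 0\n"
)

# Pattern from the question: starts at the first "Te" and may cross any line.
lazy = re.compile(r"[T][e]((?:.|\n)*?[e][c]\s\d+)")

# Pattern from this answer: '.' cannot cross a newline, so the mtu and
# carrier-delay lines must follow the TenGig line directly.
anchored = re.compile(
    r"(tengig.*?(?:\n+)?\bmtu\b.*?(?:\n+)?\bcarrier-delay\b[^\n]+)",
    re.IGNORECASE,
)

lazy_hits = lazy.findall(text)
anchored_hits = anchored.findall(text)

print(lazy_hits)      # one long match that swallows "login delay 2"
print(anchored_hits)  # only the TenGigabitEthernet0/0/0 block
```

Note that the capturing group in the question's pattern also leaves out the leading "Te", so its results start with "nGigabitEthernet...".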

