
Python regex to find substring

Sample text

port0 interface GigabitEthernet0/4/0

port1 interface TenGigabitEthernet0/1/0

login delay 2

bfd-template single-hop BDI

ip ftp source-interface Loopback0

ip tftp source-interface Loopback0

interface Loopback0

interface Loopback100

interface Loopback999

description *** Loopback interface for management ***

interface TenGigabitEthernet0/0/0

mtu 9216

carrier-delay msec 0

interface TenGigabitEthernet0/1/0

mtu 9216

carrier-delay msec 0

interface GigabitEthernet0/4/0

mtu 9216

interface GigabitEthernet0/4/1

My regex is

[T][e]((?:.|\n)*?[e][c]\s\d+)

and I'm verifying it at pythex.org.

It matches the below:

TenGigabitEthernet0/1/0

mtu 9216

carrier-delay msec 0 

That is what I want. But it also matches:

TenGigabitEthernet0/1/0

login delay 2

bfd-template single-hop BDI

ip ftp source-interface Loopback0

ip tftp source-interface Loopback0

interface Loopback0

interface Loopback100

interface Loopback999

description *** Loopback interface for management ***

interface TenGigabitEthernet0/0/0

mtu 9216

carrier-delay msec 0

That I don't want. I am looking for a multiline regex that matches exactly, and only, the tengig-mtu-carrier-delay part(s) of my string.

What I have written is:

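import re

# `file` holds the path to the configuration dump
# newline="" keeps the raw line endings so the '\r\r\n' runs can be stripped
with open(file, newline="") as buffer_:
    sb = buffer_.read().replace('\r\r\n', '')
inf = re.compile(r'[T][e]((?:.|\n)*?[e][c]\s\d+)')
intf = inf.findall(sb)
print(intf)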

It works perfectly for files that have the tengig-mtu-carrier-delay lines in sequence, but not when tengig also appears elsewhere in the file.

I think this regex is what you want:

(tengig.*?(?:\n+)?\bmtu\b.*?(?:\n+)?\bcarrier-delay\b[^\n]+)

Regex Demo

Regex Breakdown

( #Capturing group
 tengig #Match tengig literally (case-insensitively, via re.IGNORECASE)
 .*? #Lazily match as little as needed to reach the next requirement (. does not cross \n)
 (?:\n+)? #Optionally match one or more newlines
 \bmtu\b #Match mtu as a whole word
 .*? #Lazily match as little as needed to reach the next requirement
 (?:\n+)? #Optionally match one or more newlines
 \bcarrier-delay\b #Match carrier-delay as a whole word
 [^\n]+ #Match everything up to the next newline
) #End capturing group
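
The breakdown above depends on one detail: without re.DOTALL, `.` does not match a newline, so `.*?` cannot run past the end of the tengig line, and the only way to reach `mtu` is through the optional `\n+`. The following is a minimal sketch contrasting this with the `(?:.|\n)*?` from the question. The shortened sample string and the names `loose` and `strict` are made up for illustration.

```python
import re

# Shortened, hypothetical sample: the first TenGig line is NOT followed by mtu.
sample = (
    "port1 interface TenGigabitEthernet0/1/0\n"
    "login delay 2\n"
    "interface TenGigabitEthernet0/0/0\n"
    "mtu 9216\n"
    "carrier-delay msec 0\n"
)

# Pattern from the question: (?:.|\n)*? may cross any number of lines.
loose = re.compile(r'[T][e]((?:.|\n)*?[e][c]\s\d+)')

# Pattern from the answer: '.' stops at a newline, so each piece stays on its own line.
strict = re.compile(
    r'(tengig.*?(?:\n+)?\bmtu\b.*?(?:\n+)?\bcarrier-delay\b[^\n]+)',
    re.IGNORECASE,
)

loose_hits = loose.findall(sample)
strict_hits = strict.findall(sample)

# The loose match starts at the first "Te" and swallows the unrelated lines.
print("loose :", loose_hits[0].splitlines())
# The strict match contains only the three wanted lines.
print("strict:", strict_hits[0].splitlines())
```

The loose pattern starts at the first "Te" it finds and keeps consuming lines until it reaches "ec" followed by a digit, which is why the unrelated `login delay 2` line ends up inside the match.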

Python Code

import re
p = re.compile(r'(tengig.*?(?:\n+)?\bmtu\b.*?(?:\n+)?\bcarrier-delay\b[^\n]+)', re.MULTILINE | re.IGNORECASE)
test_str = "port0 interface GigabitEthernet0/4/0\n\nport1 interface TenGigabitEthernet0/1/0\n\nlogin delay 2\n\nbfd-template single-hop BDI\n\nip ftp source-interface Loopback0\n\nip tftp source-interface Loopback0\n\ninterface Loopback0\n\ninterface Loopback100\n\ninterface Loopback999\n\ndescription * Loopback interface for management *\n\ninterface TenGigabitEthernet0/0/0\n\nmtu 9216\n\ncarrier-delay msec 0\n\ninterface TenGigabitEthernet0/1/0\n\nmtu 9216\n\ncarrier-delay msec 0\n\ninterface GigabitEthernet0/4/0\n\nmtu 9216\n\ninterface GigabitEthernet0/4/1\n"
print(p.findall(test_str))  # list of the tengig / mtu / carrier-delay blocks found

Ideone Demo
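
To connect the pattern back to the file-reading code in the question, here is a rough sketch of how it might be applied to a file. The function names `find_tengig_blocks` and `find_tengig_blocks_in_file` are hypothetical, and turning runs of carriage returns into a single `\n` is an assumption about what the question's `replace('\r\r\n', '')` was meant to achieve.

```python
import re

TENGIG_BLOCK = re.compile(
    r'(tengig.*?(?:\n+)?\bmtu\b.*?(?:\n+)?\bcarrier-delay\b[^\n]+)',
    re.IGNORECASE,
)

def find_tengig_blocks(text):
    """Return every tengig / mtu / carrier-delay block found in text."""
    # Assumption: stray carriage returns ('\r\r\n', '\r\n') should become '\n'.
    normalized = re.sub(r'\r+\n', '\n', text)
    return TENGIG_BLOCK.findall(normalized)

def find_tengig_blocks_in_file(path):
    """Read path with its raw line endings and search its contents."""
    # newline="" disables newline translation so the normalizer sees the raw endings.
    with open(path, newline="") as handle:
        return find_tengig_blocks(handle.read())

if __name__ == "__main__":
    demo = (
        "interface TenGigabitEthernet0/0/0\r\r\n"
        "mtu 9216\r\r\n"
        "carrier-delay msec 0\r\r\n"
    )
    for block in find_tengig_blocks(demo):
        print(block)
```

Keeping the search in a function that takes a string makes it easy to check against small samples before pointing it at a real configuration file.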
