Python regex to find substring
Sample text
port0 interface GigabitEthernet0/4/0
port1 interface TenGigabitEthernet0/1/0
login delay 2
bfd-template single-hop BDI
ip ftp source-interface Loopback0
ip tftp source-interface Loopback0
interface Loopback0
interface Loopback100
interface Loopback999
description *** Loopback interface for management ***
interface TenGigabitEthernet0/0/0
mtu 9216
carrier-delay msec 0
interface TenGigabitEthernet0/1/0
mtu 9216
carrier-delay msec 0
interface GigabitEthernet0/4/0
mtu 9216
interface GigabitEthernet0/4/1
My regex is
[T][e]((?:.|\n)*?[e][c]\s\d+)
and I'm verifying it at pythex.org.
It matches the following:
TenGigabitEthernet0/1/0
mtu 9216
carrier-delay msec 0
Which is what I want. But it also matches:
TenGigabitEthernet0/1/0
login delay 2
bfd-template single-hop BDI
ip ftp source-interface Loopback0
ip tftp source-interface Loopback0
interface Loopback0
interface Loopback100
interface Loopback999
description *** Loopback interface for management ***
interface TenGigabitEthernet0/0/0
mtu 9216
carrier-delay msec 0
Which I don't want. I am looking for a multiline regex that matches only the tengig-mtu-carrier-delay part(s) of my string, and all of them.
What I have written is:
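import re

# `file` holds the path to the saved configuration
with open(file, "rb") as buffer_:
    sb = buffer_.read().decode().replace('\r\r\n', '')
# This is the pattern in question; it is the one that over-matches
inf = re.compile(r'[T][e]((?:.|\n)*?[e][c]\s\d+)')
intf = inf.findall(sb)
print(intf)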
It works perfectly for files in which the tengig-mtu-carrier-delay lines appear consecutively, but not when tengig also appears elsewhere in the file. How can I handle that case?
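The over-match described above can be reproduced in a few lines. This is only a sketch on a shortened version of the sample text; the names `original`, `text`, and `matches` are illustrative. The likely cause is that `(?:.|\n)*?` is free to cross any number of lines, so a match that starts at the first `Te` keeps growing until the first `ec` followed by whitespace and a digit.

```python
import re

# The pattern from the question, unchanged
original = re.compile(r'[T][e]((?:.|\n)*?[e][c]\s\d+)')

# Shortened sample: a stray TenGig reference precedes the real block
text = (
    "port1 interface TenGigabitEthernet0/1/0\n"
    "login delay 2\n"
    "interface Loopback0\n"
    "interface TenGigabitEthernet0/0/0\n"
    "mtu 9216\n"
    "carrier-delay msec 0\n"
)

matches = original.findall(text)

# A single match that starts at the stray reference and swallows
# the unrelated lines on its way to "msec 0"
print(len(matches))
print("login delay 2" in matches[0])
```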
I think this regex is what you want:
(tengig.*?(?:\n+)?\bmtu\b.*?(?:\n+)?\bcarrier-delay\b[^\n]+)
Regex Breakdown
( #Capturing group
tengig #Match tengig literally (case-insensitively, given re.IGNORECASE)
.*? #Lazy matching up to the next requirement
(?:\n+)? #Match one or more newlines (OPTIONAL)
\bmtu\b #Match mtu as a whole word
.*? #Lazy matching up to the next requirement
(?:\n+)? #Match one or more newlines (OPTIONAL)
\bcarrier-delay\b #Match carrier-delay as a whole word
[^\n]+ #Match everything up to the next newline
) #End capturing group
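The breakdown above can also be kept inside the code by compiling the pattern with `re.VERBOSE`, which ignores whitespace and `#` comments in the pattern. A minimal sketch follows; `PATTERN`, `find_blocks`, and `sample` are illustrative names, not part of the original answer.

```python
import re

PATTERN = re.compile(
    r"""
    (                    # capturing group
      tengig             # literal "tengig" (any case, via re.IGNORECASE)
      .*?                # lazily consume the rest of the interface line
      (?:\n+)?           # optional run of newlines
      \bmtu\b            # the word "mtu"
      .*?                # lazily consume the rest of the mtu line
      (?:\n+)?           # optional run of newlines
      \bcarrier-delay\b  # the word "carrier-delay"
      [^\n]+             # everything up to the end of that line
    )
    """,
    re.VERBOSE | re.IGNORECASE,
)

def find_blocks(text):
    """Return all tengig/mtu/carrier-delay blocks found in text."""
    return PATTERN.findall(text)

sample = (
    "port1 interface TenGigabitEthernet0/1/0\n"
    "login delay 2\n"
    "interface TenGigabitEthernet0/0/0\n"
    "mtu 9216\n"
    "carrier-delay msec 0\n"
    "interface GigabitEthernet0/4/0\n"
    "mtu 9216\n"
)

blocks = find_blocks(sample)
print(blocks)
```

Because `.` does not match a newline without `re.DOTALL`, each `.*?` stays on its own line, so the stray `TenGigabitEthernet0/1/0` reference followed by `login delay 2` is skipped.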
Python Code
import re
p = re.compile(r'(tengig.*?(?:\n+)?\bmtu\b.*?(?:\n+)?\bcarrier-delay\b[^\n]+)', re.MULTILINE | re.IGNORECASE)
test_str = "port0 interface GigabitEthernet0/4/0\n\nport1 interface TenGigabitEthernet0/1/0\n\nlogin delay 2\n\nbfd-template single-hop BDI\n\nip ftp source-interface Loopback0\n\nip tftp source-interface Loopback0\n\ninterface Loopback0\n\ninterface Loopback100\n\ninterface Loopback999\n\ndescription * Loopback interface for management *\n\ninterface TenGigabitEthernet0/0/0\n\nmtu 9216\n\ncarrier-delay msec 0\n\ninterface TenGigabitEthernet0/1/0\n\nmtu 9216\n\ncarrier-delay msec 0\n\ninterface GigabitEthernet0/4/0\n\nmtu 9216\n\ninterface GigabitEthernet0/4/1\n"
print(p.findall(test_str))  # list of every matched tengig-mtu-carrier-delay block
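To run the same pattern against a file, as the question's snippet does, something along these lines may work. It is a sketch under assumptions: `find_tengig_blocks`, `TENGIG_BLOCK`, and the temporary file are hypothetical names used for illustration, and the file is assumed to be plain text. Opening it in text mode lets Python normalize the line endings, and the `\n+` in the pattern tolerates any blank lines that result.

```python
import os
import re
import tempfile

# Same pattern as in the answer; re.MULTILINE is omitted because the
# pattern contains no ^ or $ anchors
TENGIG_BLOCK = re.compile(
    r'(tengig.*?(?:\n+)?\bmtu\b.*?(?:\n+)?\bcarrier-delay\b[^\n]+)',
    re.IGNORECASE,
)

def find_tengig_blocks(path):
    """Return every tengig/mtu/carrier-delay block in the file at path."""
    with open(path, "r") as handle:
        return TENGIG_BLOCK.findall(handle.read())

# Demo on a throwaway file holding a shortened copy of the sample text
config = (
    "port1 interface TenGigabitEthernet0/1/0\n"
    "login delay 2\n"
    "interface TenGigabitEthernet0/0/0\n"
    "mtu 9216\n"
    "carrier-delay msec 0\n"
    "interface TenGigabitEthernet0/1/0\n"
    "mtu 9216\n"
    "carrier-delay msec 0\n"
    "interface GigabitEthernet0/4/0\n"
    "mtu 9216\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(config)

blocks = find_tengig_blocks(tmp.name)
os.remove(tmp.name)

for block in blocks:
    print(block)
    print("---")
```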