
Splitting a string between two words

The following string

text = 'FortyGigE1/0/53\r\nCurrent state: DOWN\r\nLine protocol state: DOWN\r\n\r\nFortyGigE1/0/54\r\nCurrent state: DOWN\r\nLine protocol state: DOWN\r\n\r\n'

should be split into this:

output = [
    'FortyGigE1/0/53\r\nCurrent state: DOWN\r\nLine protocol state: DOWN\r\n\r\n',
    'FortyGigE1/0/54\r\nCurrent state: DOWN\r\nLine protocol state: DOWN\r\n\r\n'
]

The delimiters should not be removed by the split.

delimiters = r'(GigabitEthernet\d*/\d*/\d*\s.*|FortyGigE\d*/\d*/\d*\s.*)'

I tried to do this:

output = re.split(delimiters, text)

But this is the output I get, with more splits than I expected:

['',
 'FortyGigE1/0/53\r', '\nCurrent state: DOWN\r\nLine protocol state: DOWN\r\n\r\n',
 'FortyGigE1/0/54\r', '\nCurrent state: DOWN\r\nLine protocol state: DOWN\r\n\r\n']

At least with your example, you can do:

>>> re.split(r'(?<=DOWN\r\n\r\n)(?=FortyGigE)', text)
['FortyGigE1/0/53\r\nCurrent state: DOWN\r\nLine protocol state: DOWN\r\n\r\n',
 'FortyGigE1/0/54\r\nCurrent state: DOWN\r\nLine protocol state: DOWN\r\n\r\n']

Compared with your stated desired output:

>>> output==re.split(r'(?<=DOWN\r\n\r\n)(?=FortyGigE)', text)
True

It works by using a zero-width lookbehind (?<=DOWN\r\n\r\n) and a zero-width lookahead (?=FortyGigE) to mark the point at which to split.
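
A minimal sketch of why this helps, assuming Python 3.7 or later (where re.split can split on an empty match). A capturing group makes re.split return the matched delimiter text as separate list items, whereas lookarounds match a zero-width position, so nothing is consumed and the pieces join back into the original text. The split_blocks helper and its list of names are hypothetical, added only for illustration.

```python
import re

text = (
    'FortyGigE1/0/53\r\nCurrent state: DOWN\r\nLine protocol state: DOWN\r\n\r\n'
    'FortyGigE1/0/54\r\nCurrent state: DOWN\r\nLine protocol state: DOWN\r\n\r\n'
)

# Capturing group: the matched delimiter text comes back as separate items.
with_group = re.split(r'(FortyGigE\d*/\d*/\d*\s.*)', text)

def split_blocks(s, names=('GigabitEthernet', 'FortyGigE')):
    """Hypothetical helper: split before an interface name that follows a blank line."""
    alternatives = '|'.join(re.escape(n) for n in names)
    # Zero-width: lookbehind for the blank line, lookahead for an interface name.
    pattern = r'(?<=\r\n\r\n)(?=' + alternatives + ')'
    return re.split(pattern, s)

blocks = split_blocks(text)
print(len(with_group), len(blocks))
print(''.join(blocks) == text)  # nothing was consumed by the split
```

Requiring only the blank line in the lookbehind, rather than DOWN plus the blank line, is a design choice for the sketch; it would also split blocks whose last line does not end in DOWN.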

Here is a regex101 demo; the \r characters are removed there since they are not supported on that platform.

Your tip gave me the solution to my problem. Here is an excerpt of my script:

import re

# "file" holds the path to the saved device output
with open(file, "r") as f:
    content = f.read()
#
# This delimiter is only an example. The interface names are much longer
deliminators = r'(?=\nBridge-Aggregation|\nHundredGigE|\nFortyGigE|\nTen-GigabitEthernet)'
#
dev_interfaces = re.split(deliminators, content)
max_interfaces = len(dev_interfaces)
# Delete the leading linefeed (\n) of each interface
for index in range(max_interfaces):
    dev_interfaces[index] = dev_interfaces[index].lstrip('\n')
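
As a possible variation on the excerpt above, the lookahead can be anchored with ^ under re.MULTILINE so that the newline stays at the end of the previous block and no lstrip pass is needed. This is only a sketch under stated assumptions: the sample text is made up, the names are the shortened ones from the excerpt, and split_interfaces is a hypothetical helper. Because the pattern can also match at position 0, an empty first item may appear and is filtered out.

```python
import re

NAMES = ('Bridge-Aggregation', 'HundredGigE', 'FortyGigE', 'Ten-GigabitEthernet')

def split_interfaces(content):
    """Hypothetical helper: return one list item per interface block."""
    alternatives = '|'.join(re.escape(n) for n in NAMES)
    # ^ with re.MULTILINE matches at the start of every line; the lookahead
    # keeps the interface name in the block that follows the split point.
    pattern = re.compile(r'^(?=' + alternatives + ')', re.MULTILINE)
    # Drop the empty string produced when the text starts with an interface name.
    return [block for block in pattern.split(content) if block]

# Made-up sample; real device output has more lines per interface.
sample = (
    'FortyGigE1/0/53\nCurrent state: DOWN\n\n'
    'Ten-GigabitEthernet1/0/1\nCurrent state: UP\n'
)
for block in split_interfaces(sample):
    print(repr(block))
```

Unlike the lstrip version, this keeps every character of the input, so joining the blocks reproduces the original content.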
