Python regex: match everything between a defining word at the start of a line and a defining word on another line

Question

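I have the file below, which is part of a configuration and contains references to ruledefs (e.g. rd-6). Apart from the rulebase and ruledef names, the structure of the config file always looks the same. This part is the rulebase-definition (for the purposes of this question, it is also my RB-definitions.txt):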

##Rulebase-definition  
rulebase bb
      action priority 6 dynamic-only ruledef rd-6 charging-action throttle monitoring-key 1
      action priority 7 dynamic-only ruledef rd-7 charging-action p2p_Drop
      action priority 139 dynamic-only ruledef rd-8 charging-action p2p_Drop monitoring-key 1
#exit

This is a sample ruledef-definition (it is also the output I am looking for in this question):

##Ruledef-definition
ruledef rd-8
          ip server-ip-address range host-pool BB10_RIM_1
          ip server-ip-address range host-pool BB10_RIM_2
#exit
ruledef rd-3
          ip any-match = TRUE
#exit

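I am able to match a specific rulebase name (with its rulebase definition) given via raw_input() and save it to the file RB-definitions.txt, as shown above. I am also able to match the ruledef names (but only the names) in RB-definitions.txt and store them in ruledef_list, as shown below: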

import re

RDFile = open('RB-definitions.txt')
txt2 = RDFile.read()
ruledef_list = []
# capture whatever sits between "ruledef" and "charging-action"
for match2 in re.findall(r'(?<=ruledef)((?:.|\n)*?)(?=charging-action)', txt2):
    print(match2 + "\n")
    ruledef_list.append(match2)

However, I still fail when I have to match a specific ruledef from the ruledef-definition shown above. The word ruledef always comes first on the line:

start_tag = '^ruledef '  # additional space char
content = '((?:.|\n)*?)'
end_tag = '#exit'

for RD_name in ruledef_list:
    print(RD_name)
    for match in re.findall(start_tag + RD_name + content + end_tag, txt):
        print(match + end_tag + "\n")

I have tried '^ruledef', '^ruledef\s+' and even '([ruledef])\b', but none of them work. I have to match on the first word of the line, because otherwise I would also match the part of the rulebase-definition that contains "ruledef".

How can I match everything between the defining first word and the next "#exit" line, so that I get the following output?

ruledef rd-8
      ip server-ip-address range host-pool BB10_RIM_1
      ip server-ip-address range host-pool BB10_RIM_2
#exit
ruledef rd-3
      ip any-match = TRUE
#exit

For better understanding, the whole script with a sample configuration is available at http://pastebin.com/q3VUeAdh

Answer 1

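You are missing multiline mode. Without it, ^ only matches at the very beginning of the whole string. Also, you can avoid (?:.|\n) by using single-line/dotall mode (which makes . match any character, including newlines):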

start_tag = r'^ruledef ' #additional space char
content = r'(.*?)'                                
end_tag = r'#exit'

...

for match in re.findall(start_tag + RD_name + content + end_tag, txt, re.M|re.S):
    ...
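
To see the effect of the two flags end to end, here is a minimal self-contained sketch. The sample text, the contents of ruledef_list, and the use of strip() and re.escape() are assumptions added for illustration and are not part of the original script. The strip() is there because the names captured by the question's pattern appear to include the spaces around them.

```python
import re

# Sample input, shortened from the question (assumed for illustration).
txt = """##Rulebase-definition
rulebase bb
      action priority 6 dynamic-only ruledef rd-8 charging-action p2p_Drop
#exit
##Ruledef-definition
ruledef rd-8
          ip server-ip-address range host-pool BB10_RIM_1
          ip server-ip-address range host-pool BB10_RIM_2
#exit
ruledef rd-3
          ip any-match = TRUE
#exit
"""

start_tag = r'^ruledef '
content = r'(.*?)'
end_tag = r'#exit'

# Names as the question's pattern would capture them, with surrounding spaces.
ruledef_list = [' rd-8 ', ' rd-3 ']

bodies = {}
for RD_name in ruledef_list:
    name = RD_name.strip()  # drop the captured spaces
    pattern = start_tag + re.escape(name) + content + end_tag
    # re.M lets ^ match at every line start; re.S lets . cross newlines.
    for match in re.findall(pattern, txt, re.M | re.S):
        bodies[name] = match
        print('ruledef ' + name + match + end_tag)
```

Because ^ now anchors at each line start, the "ruledef rd-8" inside the rulebase action line is not picked up. Note that a name such as rd-8 would still match as a prefix of rd-80; appending \b after the name is one possible guard.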

Note that this will give you only the contents of the ruledef (i.e. only what was matched by the content part: no ruledef, no name, no #exit). If this is not what you want, simply remove the parentheses in content:

...
content = r'.*?'
...
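
The difference comes from how re.findall behaves: with one capturing group it returns that group's text, and with no groups it returns the whole match. A small sketch on an assumed one-block sample:

```python
import re

# Assumed sample: a single ruledef block.
sample = "ruledef rd-3\n          ip any-match = TRUE\n#exit\n"

# With a capturing group, findall returns only the group's text.
with_group = re.findall(r'^ruledef rd-3(.*?)#exit', sample, re.M | re.S)
# Without a group, findall returns the whole match.
without_group = re.findall(r'^ruledef rd-3.*?#exit', sample, re.M | re.S)

print(with_group)     # only the text between the name and #exit
print(without_group)  # the whole block, from "ruledef" through "#exit"
```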

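By the way, using a negative lookahead instead of the ungreedy quantifier might be more efficient (but it does not have to be; profile this if speed is important to you):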

...
content = r'(?:(?!#exit).)*'
...
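
One way to profile the two variants is sketched below. It first checks that both return the same text on a sample and then times each with timeit. The sample block and the repeat count are arbitrary assumptions, and the timings will vary with the input and the Python version.

```python
import re
import timeit

# Assumed sample: one ruledef block with 50 body lines.
sample = ("ruledef rd-8\n"
          + "          ip server-ip-address range host-pool X\n" * 50
          + "#exit\n")

# Ungreedy quantifier versus negative lookahead before each character.
lazy = re.compile(r'^ruledef rd-8(.*?)#exit', re.M | re.S)
tempered = re.compile(r'^ruledef rd-8((?:(?!#exit).)*)#exit', re.M | re.S)

lazy_result = lazy.findall(sample)
tempered_result = tempered.findall(sample)
same = lazy_result == tempered_result
print(same)

# Time 1000 runs of each; compare the two numbers on your own data.
print(timeit.timeit(lambda: lazy.findall(sample), number=1000))
print(timeit.timeit(lambda: tempered.findall(sample), number=1000))
```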

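Finally, note how I use raw strings for all regex patterns. This is simply good practice in Python; otherwise you can run into trouble with complicated escape patterns (i.e. you would have to double-escape certain things).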
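
A short sketch of what can go wrong without raw strings, using \b as the example (the sample text is an assumption):

```python
import re

text = "ruledef rd-8"

# In a normal string, '\b' is a backspace character (0x08), not a word boundary.
plain = re.findall('\bruledef\b', text)
# In a raw string the backslash reaches the regex engine, so \b is a word boundary.
raw = re.findall(r'\bruledef\b', text)
# Without a raw string, the backslash has to be doubled to get the same pattern.
doubled = re.findall('\\bruledef\\b', text)

print(plain, raw, doubled)
```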

Answer 1
2 2013-06-30 16:04:02
