How to extract substrings between brackets while ignoring those between nested brackets in Python?
I have a string:
phy = '(s1:0.6507212936,((s2:0.4186036213,s3:0.4186036213):0.1428084058,((s4:0.1429514535,s5:0.1429514535):0.1695879844,s6:0.3125394379):0.2488725892):0.08930926654);'
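How can I extract only the substrings that lie between parentheses and do not themselves contain any parentheses? From my example, I need two outputs: "s2:0.4186036213,s3:0.4186036213" and "s4:0.1429514535,s5:0.1429514535".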
You can use a regular expression:
import re
phy = '(s1:0.6507212936,((s2:0.4186036213,s3:0.4186036213):0.1428084058,((s4:0.1429514535,s5:0.1429514535):0.1695879844,s6:0.3125394379):0.2488725892):0.08930926654);'
re.findall(r'\(([^\(\)]*)\)', phy)
# ['s2:0.4186036213,s3:0.4186036213', 's4:0.1429514535,s5:0.1429514535']
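This captures every run of non-parenthesis characters enclosed between an opening and a closing parenthesis, which is exactly the innermost groups. However, it does not check that the parentheses in the string are correctly nested.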
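If the input might be malformed, one option is to check the balance of the parentheses before running the regex. The following is only a minimal sketch of that idea: the function name `innermost_groups` is hypothetical, and it handles round parentheses only.

```python
import re

def innermost_groups(text):
    """Return the innermost parenthesised substrings of text.

    Raises ValueError if the parentheses are not balanced.
    (Hypothetical helper, not part of the original answer.)
    """
    depth = 0
    for ch in text:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth < 0:
                # A ')' appeared before any matching '('.
                raise ValueError("closing parenthesis without a matching opener")
    if depth != 0:
        raise ValueError("unclosed opening parenthesis")
    # Same idea as the regex above: capture runs that contain no parentheses.
    return re.findall(r'\(([^()]*)\)', text)

phy = '(s1:0.6507212936,((s2:0.4186036213,s3:0.4186036213):0.1428084058,((s4:0.1429514535,s5:0.1429514535):0.1695879844,s6:0.3125394379):0.2488725892):0.08930926654);'
print(innermost_groups(phy))
# ['s2:0.4186036213,s3:0.4186036213', 's4:0.1429514535,s5:0.1429514535']
```

The check is a simple depth counter, so it costs one extra pass over the string and leaves the extraction itself to the regex.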
Try this:
# Map each opening bracket to its closing counterpart.
closing_for = {
    '(': ')',
    '{': '}',
    '[': ']',
}
phy = '(s1:0.6507212936,((s2:0.4186036213,s3:0.4186036213):0.1428084058,((s4:0.1429514535,s5:0.1429514535):0.1695879844,s6:0.3125394379):0.2488725892):0.08930926654);'
inner_items = []
start_index = None  # index of the most recent opening bracket
opener = None       # the bracket character found at start_index
for i, ch in enumerate(phy):
    if ch in closing_for:
        # A new opener means any earlier candidate contains a nested bracket.
        start_index, opener = i, ch
    elif opener is not None and ch == closing_for[opener]:
        # First matching closer after the latest opener: an innermost group.
        inner_items.append(phy[start_index + 1:i])
        start_index = opener = None
print(inner_items)
# ['s2:0.4186036213,s3:0.4186036213', 's4:0.1429514535,s5:0.1429514535']
Using a regular expression:
import re
reg = re.compile(r'[(]([^()]+)[)]')
phy = '(s1:0.6507212936,((s2:0.4186036213,s3:0.4186036213):0.1428084058,((s4:0.1429514535,s5:0.1429514535):0.1695879844,s6:0.3125394379):0.2488725892):0.08930926654)'
print(reg.findall(phy))
Output:
C:\Users\Desktop>py x.py
['s2:0.4186036213,s3:0.4186036213', 's4:0.1429514535,s5:0.1429514535']
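The two regex answers differ in one small detail: the first uses `*` and the second uses `+` inside the group. On the string from the question they give the same result, but they diverge when the input contains an empty pair of parentheses. A short sketch, using a made-up `sample` string that is not from the question:

```python
import re

# First answer's pattern; inside a character class, \( and \) are just ( and ).
star = re.compile(r'\(([^()]*)\)')
# Second answer's pattern; requires at least one character between the parentheses.
plus = re.compile(r'[(]([^()]+)[)]')

sample = '(a,(),(b,c))'  # hypothetical input containing an empty pair "()"

print(star.findall(sample))  # ['', 'b,c']
print(plus.findall(sample))  # ['b,c']
```

Which behaviour is preferable depends on whether an empty group is meaningful in your data.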