
Error in regular expression to find the text between parentheses

I have a string:

string  ='((clearance) AND (embedded) AND (software engineer OR developer)) AND (embedded)'

I want to split it into a list based on the parentheses, so I tried a solution I had been given:

my_data = re.findall(r"(\(.*?\))",string)

But when I print my_data, the output is (len = 4):

['((clearance)', '(embedded)', '(software engineer OR developer)', '(embedded)']

But the output I want is (len = 2):

['(clearance) AND (embedded) AND (software engineer OR developer)', '(embedded)']

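This is because '(clearance) AND (embedded) AND (software engineer OR developer)' is inside one pair of parentheses and '(embedded)' is inside another. But re.findall splits the string into 4 items. Why?

How should I modify the regular expression to get the output I want?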

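Matching arbitrarily nested parentheses isn't possible with a pure regular expression (Python's re module has no recursive patterns), so here is an idea that counts parentheses instead: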

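def find_stuff(string):
    """Return the start/end indices of each outermost parenthesized group."""
    indices = []
    counter = 0  # current nesting depth
    change = {"(": 1, ")": -1}
    for i, el in enumerate(string):
        # Characters other than parentheses leave the depth unchanged.
        new_count = counter + change.get(el, 0)
        if counter == 0 and new_count == 1:
            # Depth goes 0 -> 1: an outermost group opens here.
            indices.append(i)
        elif counter == 1 and new_count == 0:
            # Depth goes 1 -> 0: the group closes; i + 1 is the slice end.
            indices.append(i + 1)
        counter = new_count
    return indices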

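It's not very pretty, but I think the concept is clear. It returns the indices of the outermost parentheses as alternating start and end positions, so you can slice the string with consecutive pairs of indices.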
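The slicing step is not shown in the answer, so the following is only a sketch of how it might look. The function is repeated so the block runs on its own; the pairing with zip and the name outer_groups are assumptions added for illustration, not part of the original answer.

```python
def find_stuff(string):
    # Same depth-counting idea as in the answer, repeated so this runs standalone.
    indices = []
    counter = 0
    change = {"(": 1, ")": -1}
    for i, el in enumerate(string):
        new_count = counter + change.get(el, 0)
        if counter == 0 and new_count == 1:
            indices.append(i)
        elif counter == 1 and new_count == 0:
            indices.append(i + 1)
        counter = new_count
    return indices


string = '((clearance) AND (embedded) AND (software engineer OR developer)) AND (embedded)'
indices = find_stuff(string)

# Even positions are group starts, odd positions are group ends (assumed pairing).
outer_groups = [string[start:end] for start, end in zip(indices[::2], indices[1::2])]
print(outer_groups)
# ['((clearance) AND (embedded) AND (software engineer OR developer))', '(embedded)']
```

Note that each slice keeps its own outermost pair of parentheses, so the first item still differs from the output requested in the question by one layer; slicing an item with [1:-1] would drop that layer if it is not wanted.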

It's a bit of an re hack, but it is possible:

>>> import re
>>> string  ='((clearance) AND (embedded) AND (software engineer OR developer)) AND (embedded)'
>>> [e for e in re.split(r'\((?=\()(.*?)(?<=\))\)|(?<!\()(\([^()]+\))(?!\))',string) if e and '(' in e and ')' in e]
['(clearance) AND (embedded) AND (software engineer OR developer)', '(embedded)']
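As for why the pattern in the question returns four items: the quantifier in `\(.*?\)` is lazy, so each match starts at the leftmost unconsumed "(" and ends at the first ")" it can reach, with no notion of nesting depth. A small sketch that reproduces this (the variable names are illustrative):

```python
import re

text = '((clearance) AND (embedded) AND (software engineer OR developer)) AND (embedded)'

# ".*?" expands one character at a time and stops at the first ")",
# so the first match swallows both opening parentheses but only one closer.
lazy_matches = [m.group(0) for m in re.finditer(r"\(.*?\)", text)]
print(lazy_matches)
# ['((clearance)', '(embedded)', '(software engineer OR developer)', '(embedded)']
```

The stray ")" that closes the outer group is never part of a match, because no match can begin on a closing parenthesis.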

