How to match the innermost parenthesis set using Python regex?

I'm trying to extract the innermost parenthesized substring from the string below using Python:

x = "a (b) (c (d) e"

I want the following substring as output:

(d)

This is what I have tried so far:

re.findall(r"\(.*?\)", x)
re.findall(r"\(.*\)", x)

but both attempts return the outer substrings, which is not useful. I want to match only the innermost substring enclosed in ( ). This example is part of a longer, more complex string, but it illustrates my issue well. I need a regex solution that relies only on the parentheses.
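For reference, here is a small sketch of what the two attempts above return on x. The variable names lazy and greedy are only illustrative. The lazy pattern stops at the first ) it meets, but it still starts at the earliest ( it can, so it swallows the nested opening bracket; the greedy pattern runs from the first ( to the last ).

```python
import re

x = "a (b) (c (d) e"

# Lazy: starts at each "(" and stops at the first ")" that follows,
# so the second match still contains the nested "(".
lazy = re.findall(r"\(.*?\)", x)

# Greedy: runs from the first "(" to the last ")" in the string.
greedy = re.findall(r"\(.*\)", x)

print(lazy)    # ['(b)', '(c (d)']
print(greedy)  # ['(b) (c (d)']
```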

The regex I use for this purpose is:

(\([^\(]*?\))

Here it is demonstrated at regex101.

i.e.

import re
groups = [m for m in re.finditer(r"(\([^\(]*?\))", text)]

This returns all deepest-level bracketed groups in a string.

For example, given the string:

"(Mary ( had a (little) ) lamb)"

the regex returns "(little)".

In strings that contain several separately bracketed sections, the regex returns every group that has no further bracket nested inside it, that is, the deepest group within each section.

e.g.

"(its) fleece (as (white) ) as snow."

would return two groups, "(its)" and "(white)".
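A minimal sketch that runs the regex above against both example strings and against the string from the question. The names INNERMOST and innermost_groups are hypothetical helpers introduced here for illustration only.

```python
import re

# Pattern from the answer: "(" followed lazily by characters that are
# not "(", up to the first ")". A group containing a nested "(" cannot match.
INNERMOST = re.compile(r"(\([^\(]*?\))")

def innermost_groups(text):
    """Return every bracketed group that contains no further opening bracket."""
    return [m.group(1) for m in INNERMOST.finditer(text)]

print(innermost_groups("(Mary ( had a (little) ) lamb)"))       # ['(little)']
print(innermost_groups("(its) fleece (as (white) ) as snow."))  # ['(its)', '(white)']
print(innermost_groups("a (b) (c (d) e"))                       # ['(b)', '(d)']
```

Note that for the string in the question this yields both "(b)" and "(d)", since "(b)" also has nothing nested inside it.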

I use this for tokenising bracketed logic statements. By tokenising only the most deeply nested clauses on each pass and replacing them with flattened tokens, I can iteratively parse an entire logic statement until no brackets remain.
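The iterative approach described above could be sketched roughly as follows. This is not the answerer's actual tokeniser: the function name flatten and the placeholder scheme T0, T1, ... are assumptions made for illustration, and the sketch assumes those placeholder names do not already occur in the input.

```python
import re

INNERMOST = re.compile(r"\([^\(]*?\)")

def flatten(statement):
    """Replace innermost bracketed groups with tokens until none are left.

    Returns the flattened statement and a dict mapping each hypothetical
    token name (T0, T1, ...) to the bracketed text it replaced.
    """
    tokens = {}

    def store(match):
        name = f"T{len(tokens)}"        # assumed naming scheme
        tokens[name] = match.group(0)
        return name

    while True:
        # One pass replaces every group that is currently innermost.
        statement, count = INNERMOST.subn(store, statement)
        if count == 0:
            break
    return statement, tokens

flat, tokens = flatten("(Mary ( had a (little) ) lamb)")
print(flat)    # T2
print(tokens)  # {'T0': '(little)', 'T1': '( had a T0 )', 'T2': '(Mary T1 lamb)'}
```

If the input has an unmatched bracket, the loop still terminates, but the leftover bracket stays in the flattened result.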

It's worth ensuring that the statement being parsed has balanced brackets. For example, the string in your question, x = "a (b) (c (d) e", is missing a closing bracket and should end with a final ), as in x = "a (b) (c (d) e)".
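One simple way to check for balanced brackets before parsing is a depth counter, sketched below. The helper name brackets_balanced is hypothetical, and this uses a plain loop rather than a regex.

```python
def brackets_balanced(text):
    """Return True if every "(" has a matching ")" in the right order."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:   # a ")" appeared before its "("
                return False
    return depth == 0

print(brackets_balanced("a (b) (c (d) e"))   # False
print(brackets_balanced("a (b) (c (d) e)"))  # True
```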
