
How to parse nested comma-separated lists of parenthesized expressions

I know how to use the Python regex module to parse nested parentheses. This regular expression

\(([^()]*+(?:(?R)[^()]*)*+)\)

correctly finds the outermost parentheses in

some (text)(text here(possible text)text(possible text(more text)))end text

(example here)
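To sanity-check what "outermost parentheses" should return for that sample, here is a minimal sketch that uses a plain depth counter instead of the recursive pattern. It is a hand-written stand-in, not the regex itself, and the function name `outermost_groups` is hypothetical.

```python
def outermost_groups(text):
    """Return every top-level (...) span in text, parentheses included."""
    groups, depth, start = [], 0, None
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i  # a new outermost group opens here
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                groups.append(text[start:i + 1])  # the outermost group just closed
    return groups


sample = "some (text)(text here(possible text)text(possible text(more text)))end text"
print(outermost_groups(sample))
```

Unbalanced closing parentheses are simply ignored in this sketch; an unclosed opening parenthesis yields no group.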

I also know how to find items in a comma-separated list:

[^,]+(?=,?)

correctly matches the elements of the list

dgad asg , adgda adg, a, g, asdgdg,dg sfg

(see here)
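As a quick check of the list pattern above, here is a minimal sketch using only the standard library `re` module, since this simple pattern needs no recursion. The variable names are illustrative, and the `strip()` call is an addition that drops the padding around the commas.

```python
import re

# Sample list from the question; items may carry spaces around the commas.
text = "dgad asg , adgda adg, a, g, asdgdg,dg sfg"

# [^,]+ grabs each run of non-comma characters. The (?=,?) lookahead
# always succeeds, so it does not change what is matched.
items = [m.group().strip() for m in re.finditer(r"[^,]+(?=,?)", text)]
print(items)
```

Because the lookahead is optional, `[^,]+` alone would give the same matches here.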

But I need a combination of these two. I need to parse the elements of a comma-separated list, where the elements themselves may contain parentheses (with comma-separated lists in them). In this list

dg(dsfsd, (d,d,g)(g,as(d,f) fdg) sdfs, sf)ad asg , adgda (a) adg, a, g, asdgdg,dg sfg(f,g, (dff, d)df, g) kd

I need to identify the elements as:

first: dg(dsfsd, (d,d,g)(g,as(d,f) fdg) sdfs, sf)ad asg
second: adgda (a) adg
third: a
fourth: g
fifth: asdgdg
sixth: dg sfg(f,g, (dff, d)df, g) kd

I don't know how to combine the two regular expressions. Could someone help me, please? Thx.

You may use

r'(?>(\((?>[^()]*(?1)?)*\))|[^,])+'

See the regex demo.

Details

  • (?>(\((?>[^()]*(?1)?)*\))|[^,])+ - 1 or more occurrences (to avoid empty string matches) of either:
    • (\((?>[^()]*(?1)?)*\)) - Capturing group 1 (defined so that it can be called as a subroutine) matching:
      • \( - a ( char
      • (?>[^()]*(?1)?)* - zero or more repetitions of any 0+ chars other than ( and ), followed by an optional whole Group 1 pattern (recursed here)
      • \) - a ) char
    • | - or
    • [^,] - any char but a comma

Python demo:

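import regex as re  # third-party "regex" module; the standard "re" module has no (?1) subroutine calls

# One list element: a run of balanced (...) groups and non-comma characters.
rx = r"(?>(\((?>[^()]*(?1)?)*\))|[^,])+"
s = "dg(dsfsd, (d,d,g)(g,as(d,f) fdg) sdfs, sf)ad asg , adgda (a) adg, a, g, asdgdg,dg sfg(f,g, (dff, d)df, g) kd"
matches = re.finditer(rx, s)
for m in matches:
    # strip() drops the spaces left around the separating commas
    print(m.group().strip())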

Output:

dg(dsfsd, (d,d,g)(g,as(d,f) fdg) sdfs, sf)ad asg
adgda (a) adg
a
g
asdgdg
dg sfg(f,g, (dff, d)df, g) kd
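
If installing the third-party regex module is not an option, the same split can be sketched without any regular expression by tracking parenthesis depth and cutting only at commas found at depth zero. This is a swapped-in technique, not the pattern from the answer, and the function name `split_top_level` is hypothetical.

```python
def split_top_level(text, sep=","):
    """Split text on sep, ignoring separators nested inside parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)  # tolerate a stray ")"
        elif ch == sep and depth == 0:
            parts.append(text[start:i].strip())  # element ends at a top-level separator
            start = i + 1
    parts.append(text[start:].strip())  # trailing element after the last separator
    return parts


s = "dg(dsfsd, (d,d,g)(g,as(d,f) fdg) sdfs, sf)ad asg , adgda (a) adg, a, g, asdgdg,dg sfg(f,g, (dff, d)df, g) kd"
for item in split_top_level(s):
    print(item)
```

One difference from the regex: this sketch keeps empty elements (for example between two adjacent commas), whereas the pattern above skips them because it requires at least one character per match.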
