How to parse nested comma-separated lists of parenthesized expressions
I know how to use the Python regex module to parse nested parentheses. This regular expression
\(([^()]*+(?:(?R)[^()]*)*+)\)
correctly finds the outermost parentheses in
some (text)(text here(possible text)text(possible text(more text)))end text
I also know how to find items in a comma-separated list:
[^,]+(?=,?)
correctly matches the elements of the list
dgad asg , adgda adg, a, g, asdgdg,dg sfg
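As a minimal sketch, the behaviour of this pattern can be checked with the standard-library re module. The names ITEM, flat and nested are made up for illustration, and the short nested sample is a hypothetical input that is not taken from the question; it only shows that a comma inside parentheses is treated as a separator as well.

```python
import re

# Pattern from the question: one or more non-comma characters.
# The lookahead (?=,?) can always succeed, so it does not change the matches.
ITEM = re.compile(r"[^,]+(?=,?)")

flat = "dgad asg , adgda adg, a, g, asdgdg,dg sfg"
flat_items = [item.strip() for item in ITEM.findall(flat)]
print(flat_items)

# Hypothetical nested sample (assumption, not from the question):
# the comma between b and c splits the parenthesized element.
nested = "a(b,c), d"
nested_items = [item.strip() for item in ITEM.findall(nested)]
print(nested_items)
```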
But I need a combination of these two. I need to parse the elements of a comma-separated list, where the elements themselves may contain parentheses (with comma-separated lists in them). In this list
dg(dsfsd, (d,d,g)(g,as(d,f) fdg) sdfs, sf)ad asg , adgda (a) adg, a, g, asdgdg,dg sfg(f,g, (dff, d)df, g) kd
I need to identify the elements as:
first: dg(dsfsd, (d,d,g)(g,as(d,f) fdg) sdfs, sf)ad asg
second: adgda (a) adg
third: a
fourth: g
fifth: asdgdg
sixth: dg sfg(f,g, (dff, d)df, g) kd
I don't know how to combine the two regular expressions. Could someone help me, please? Thanks.
You may use
r'(?>(\((?>[^()]*(?1)?)*\))|[^,])+'
See the regex demo.
Details
(?>(\((?>[^()]*(?1)?)*\))|[^,])+ - 1 or more occurrences (to avoid empty string matches) of:
    (\((?>[^()]*(?1)?)*\)) - Capturing group 1 (defined to be able to use a subroutine) matching:
        \( - a (
        (?>[^()]*(?1)?)* - zero or more repetitions of any 0+ chars other than ( and ), followed with an optional whole Group 1 pattern (recursed here)
        \) - a )
    | - or
    [^,] - any char but ,
Python demo:
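import regex as re  # third-party "regex" module (supports recursion), not the stdlib "re"

rx = r"(?>(\((?>[^()]*(?1)?)*\))|[^,])+"
s = "dg(dsfsd, (d,d,g)(g,as(d,f) fdg) sdfs, sf)ad asg , adgda (a) adg, a, g, asdgdg,dg sfg(f,g, (dff, d)df, g) kd"
matches = re.finditer(rx, s)
# Each match is one top-level list element; strip the surrounding whitespace.
for m in matches:
    print(m.group().strip())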
Output:
dg(dsfsd, (d,d,g)(g,as(d,f) fdg) sdfs, sf)ad asg
adgda (a) adg
a
g
asdgdg
dg sfg(f,g, (dff, d)df, g) kd
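If installing the regex module is not an option, the same split can be approximated without any regular expression by tracking the parenthesis depth. This is a different technique from the recursive pattern above, offered only as a rough sketch; the function name split_top_level and its sep parameter are hypothetical, and it assumes parentheses are the only nesting construct.

```python
def split_top_level(text, sep=","):
    """Split text on sep, ignoring separators nested inside parentheses."""
    items = []
    current = []
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)  # tolerate a stray ')'
        if ch == sep and depth == 0:
            # Top-level separator: close the current element.
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    items.append("".join(current).strip())
    return items


s = "dg(dsfsd, (d,d,g)(g,as(d,f) fdg) sdfs, sf)ad asg , adgda (a) adg, a, g, asdgdg,dg sfg(f,g, (dff, d)df, g) kd"
parts = split_top_level(s)
for part in parts:
    print(part)
```

Unlike the regex, this sketch keeps empty elements (for example between two adjacent commas) as empty strings, so filter them out if that matters.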