
Regex to find named capturing groups with the Go programming language

I'm looking for a regex that finds named capturing groups in (other) regex strings.

Example: I want to find (?P<country>m((a|b).+)n), (?P<city>.+) and (?P<street>(5|6)\. .+) in the following regex:

/(?P<country>m((a|b).+)n)/(?P<city>.+)/(?P<street>(5|6)\. .+)

I tried the following regex to find the named capturing groups:

// Building blocks: an optional run of parenthesized sub-groups.
var subGroups string = `(\(.+\))*?`
var prefixedSubGroups string = `.+` + subGroups
var postfixedSubGroups string = subGroups + `.+`
var surroundedSubGroups string = `.+` + subGroups + `.+`

// Attempt to match a whole named group: (?P<name> ... ).
var capturingGroupNameRegex *regexp.Regexp = regexp.MustCompile(
    `(?U)` +
    `\(\?P<.+>` +
    `(` + prefixedSubGroups + `|` + postfixedSubGroups + `|` + surroundedSubGroups + `)` +
    `\)`)

The (?U) flag makes greedy quantifiers (+ and *) non-greedy, and non-greedy quantifiers (*?) greedy. Details are in the Go regex documentation.
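To illustrate the flag's effect, here is a minimal sketch using only the standard regexp package. The helper name firstMatch and the sample input are made up for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch is a hypothetical helper: it compiles pattern and returns
// the leftmost match in input (or "" if there is none).
func firstMatch(pattern, input string) string {
	return regexp.MustCompile(pattern).FindString(input)
}

func main() {
	input := "aXbYb"
	fmt.Println(firstMatch(`a.+b`, input))      // greedy by default: aXbYb
	fmt.Println(firstMatch(`(?U)a.+b`, input))  // (?U) makes + non-greedy: aXb
	fmt.Println(firstMatch(`(?U)a.+?b`, input)) // (?U) makes +? greedy: aXbYb
}
```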

But it doesn't work, because the parentheses are not matched correctly.

Matching arbitrarily nested parentheses correctly is not possible with regular expressions, because arbitrary (recursive) nesting cannot be described by a regular language.

Some modern regex flavors do support recursion (Perl, PCRE) or balanced matching (.NET), but Go is not one of them (the docs explicitly say that Perl's (?R) construct is not supported by the RE2 library that Go's regex package appears to be based on). You need to build a recursive descent parser, not a regex.
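As a rough sketch of the hand-written approach, the scanner below tracks parenthesis depth instead of using a regex. It is a simplified stand-in for a full recursive descent parser, not a complete regex parser. It assumes the pattern is well-formed, reports only outermost named groups, and treats character classes naively (a literal ] placed first inside [...] is not handled). The function name findNamedGroups is made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// findNamedGroups returns every outermost named capturing group
// "(?P<name>...)" in pattern, found by counting parentheses by hand.
func findNamedGroups(pattern string) []string {
	var groups []string
	depth := 0       // current parenthesis nesting depth
	start := -1      // start index of the named group being tracked, or -1
	startDepth := 0  // depth at which that group was opened
	inClass := false // true while inside a [...] character class
	for i := 0; i < len(pattern); i++ {
		switch c := pattern[i]; {
		case c == '\\':
			i++ // skip the escaped character, e.g. \( or \.
		case inClass:
			if c == ']' {
				inClass = false
			}
		case c == '[':
			inClass = true
		case c == '(':
			depth++
			if start < 0 && strings.HasPrefix(pattern[i:], "(?P<") {
				start, startDepth = i, depth
			}
		case c == ')':
			if start >= 0 && depth == startDepth {
				// This ')' closes the named group opened at start.
				groups = append(groups, pattern[start:i+1])
				start = -1
			}
			depth--
		}
	}
	return groups
}

func main() {
	pattern := `/(?P<country>m((a|b).+)n)/(?P<city>.+)/(?P<street>(5|6)\. .+)`
	for _, g := range findNamedGroups(pattern) {
		fmt.Println(g)
	}
	// Prints:
	// (?P<country>m((a|b).+)n)
	// (?P<city>.+)
	// (?P<street>(5|6)\. .+)
}
```

Counting depth is enough here because only the matching close parenthesis of each group is needed. A real parser would be required to validate the pattern or to report nested named groups as well.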
