Regular Expression not working when same condition is nested
Background: I'm just fiddling around with an idea for simple templating that provides only if/for/render, to see how feasible it is and whether it makes sense to use in my personal project, as opposed to using NVelocity, Razor, or anything else.
I've written a regular expression:
(?:(?:(?<open>\[if (?<if>[a-zA-Z0-9\.]+)\])(?<content>[^\[]*))+(?:[^\[]*(?<close-open>\[end if\]))+)+(?(open)(?!))
And when used with the sample text:
<div>
[if variable3]{{variable3}}[end if]
</div>
<div>
[if variable1]
{{variable1}}
[if variable2]
<br>
{{variable2}}
[end if]
[end if]
</div>
It's working as expected. I get 2 matches, and if the 2nd match is valid I can parse the inner capture.
The problem is when I have multiple nested matches. So given:
<div>
[if variable3]{{variable3}}[end if]
</div>
<div>
[if variable1]
{{variable1}}
[if variable2]
<br>
{{variable2}}
[end if]
[if variable4]
<br>
{{variable4}}
[end if]
[if variable5]
<br>
{{variable5}}
[end if]
[end if]
</div>
What I end up with is the first capture being correct, and then, for the 2nd match, all 3 inner blocks captured individually instead of the outer block.
If I expand the capture to ignore \[ for the inner content, the first and second matches combine into a single match. :(
Does anyone know how to fix this? (And if you have a better idea of how to do this templating, I would be keen to hear it in the comments.)
You may use
@"(?s)\[if\s+(?<if>[^][]+)](?<fullBody>(?>(?:(?!\[if\s|\[end\ if]).)+|(?<-open>)\[end\ if]|(?<open>)\[if\s+(?<if>[^][]+)])*(?(open)(?!)))\[end\ if]"
See the regex demo.
Details (note that you may use it inside C# code due to the x modifier):
@"(?sx) # Singleline and IgnorePatternWhitespace flags on
\[if\s+ # '[if' and then 1+ whitespace chars
(?<if>[^][]+) # Group 'if': one or more chars other than '[' and ']'
] # a ']' char
(?<fullBody> # Group 'fullBody' containing all nested if blocks
  (?> # Start of an atomic group
    (?:(?!\[if\s|\[end\ if]).)+| # any char, 1+ occurrences, that does not start a '[if ' or '[end if]' substring, or...
    (?<-open>)\[end\ if]| # an item is popped from Group 'open', then an '[end if]' substring, or
    (?<open>)\[if\s+(?<if>[^][]+)] # an item is pushed onto Group 'open', then '[if', 1+ whitespace chars, Group 'if': 1+ chars other than '[' and ']', and then a ']' char
  )* # repeat the atomic group 0 or more times
  (?(open)(?!)) # A conditional: if Group 'open' still has items on its stack, fail and backtrack
) # End of the fullBody group
\[end\ if] # '[end if]' substring"
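To see how the pattern behaves on nested input, here is a minimal sketch that runs the single-line form of the regex with Regex.Matches. The class name IfBlockDemo, the helper TopLevelBlocks, and the shortened Template string are assumptions made for illustration; they are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IfBlockDemo
{
    // The single-line pattern from the answer, unchanged.
    public const string Pattern =
        @"(?s)\[if\s+(?<if>[^][]+)](?<fullBody>(?>(?:(?!\[if\s|\[end\ if]).)+|(?<-open>)\[end\ if]|(?<open>)\[if\s+(?<if>[^][]+)])*(?(open)(?!)))\[end\ if]";

    // Hypothetical input: a shortened version of the question's second sample.
    public const string Template =
        "<div>\n[if variable3]{{variable3}}[end if]\n</div>\n" +
        "<div>\n[if variable1]\n{{variable1}}\n" +
        "[if variable2]\n<br>\n{{variable2}}\n[end if]\n" +
        "[if variable4]\n<br>\n{{variable4}}\n[end if]\n" +
        "[end if]\n</div>";

    // Returns one match per outermost if block.
    public static MatchCollection TopLevelBlocks(string input) =>
        Regex.Matches(input, Pattern);

    public static void Main()
    {
        foreach (Match m in TopLevelBlocks(Template))
        {
            // Group "if" is captured again for every nested block, so
            // Groups["if"].Value holds the LAST condition seen. The
            // block's own condition is the first capture.
            string condition = m.Groups["if"].Captures[0].Value;
            string body = m.Groups["fullBody"].Value;
            Console.WriteLine($"{condition}: {body.Length} chars in body");
        }
    }
}
```

Each match covers one outermost block, so nested blocks can be handled by running the same pattern again on the fullBody value.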
If you do not care which block an if block is nested in, you can get a flat list of all if blocks, nested ones included, using a variation of this regex:
var pattern = @"(?s)(?=(?<ifBlock>\[if\s+(?<if>[^][]+)](?<fullBody>(?>(?:(?!\[if\s|\[end\ if]).)+|(?<-open>)\[end\ if]|(?<open>)\[if\s+(?<if>[^][]+)])*(?(open)(?!)))\[end\ if]))";
The pattern above is simply wrapped in another named capturing group and placed inside a positive lookahead. While the match value will always be empty, the groups will hold all the values you may need.
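As a rough illustration of reading values out of those empty matches, the sketch below collects every if block together with its condition. The names AllIfBlocksDemo and AllBlocks and the sample string are hypothetical; the pattern itself is the lookahead variation given above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AllIfBlocksDemo
{
    // The lookahead variation from the answer, unchanged.
    public const string Pattern =
        @"(?s)(?=(?<ifBlock>\[if\s+(?<if>[^][]+)](?<fullBody>(?>(?:(?!\[if\s|\[end\ if]).)+|(?<-open>)\[end\ if]|(?<open>)\[if\s+(?<if>[^][]+)])*(?(open)(?!)))\[end\ if]))";

    // Returns (condition, full block text) for every if block, nested or not.
    public static List<(string Condition, string Block)> AllBlocks(string input)
    {
        var result = new List<(string Condition, string Block)>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            // m.Value is empty because the whole pattern is a lookahead;
            // the data lives in the named groups. The first capture of
            // "if" is the condition of the block starting at this position.
            result.Add((m.Groups["if"].Captures[0].Value,
                        m.Groups["ifBlock"].Value));
        }
        return result;
    }

    public static void Main()
    {
        // Hypothetical input with one outer block and two nested blocks.
        string input = "[if a]x[if b]y[end if][if c]z[end if][end if]";
        foreach (var (condition, block) in AllBlocks(input))
        {
            Console.WriteLine($"{condition} -> {block}");
        }
    }
}
```

Because the lookahead consumes no characters, the regex engine retries at every following position, which is why nested blocks are reported as well as the outer one.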