
What is the regular expression to match nested square bracket tags?

I created a regular expression pattern that matches square-bracket, wiki-style tags like the following:

[h1]Some content[/h1]
[b]some more content[/b]
[i]some more still[/i]

Here is a scenario:

This [b]sentence[/b] is just an [b][i]example[/i][/b].

Here is the pattern:

\[\w{1,2}\](.*?)\[\/\w{1,2}]

The thing is, sometimes the tags are nested. For example:

[b][i]nested tags content[/i][/b]

Nesting doesn't get more complicated than this. As would be expected, the pattern returns:

[b][i]nested tags content[/i]

What modification should I make to the pattern, or what other pattern should I use, so that the match captures the entire nested set?

Regular expressions don't do very well under the conditions you set. In particular, having both nested expressions and multiple occurrences per string makes the input hard for a regular expression to parse.

It might be quite heavy to go that way, but a parser generator like ANTLR is better suited for this. And if you are able, you can write your own simple string parser.
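As an illustration of the "simple string parser" idea, here is a minimal Python sketch. It is not from the thread: the function name top_level_tag_spans and the choice to silently ignore mismatched closing tags are assumptions made for the example. It scans for tags with a small regex and uses a stack to find where each outermost tag closes.

```python
import re

# Matches an opening or closing tag such as [b] or [/b].
TAG = re.compile(r"\[(/?)(\w{1,2})\]")

def top_level_tag_spans(text):
    """Return each outermost [tag]...[/tag] span, nested tags included.

    Hypothetical helper for illustration only.
    """
    spans = []
    stack = []    # names of the tags that are currently open
    start = None  # index where the current outermost tag opened
    for m in TAG.finditer(text):
        is_closing = m.group(1) == "/"
        name = m.group(2)
        if not is_closing:
            if not stack:
                start = m.start()
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
            if not stack:
                spans.append(text[start:m.end()])
        # Assumption: unmatched or mismatched closing tags are ignored.
    return spans

if __name__ == "__main__":
    print(top_level_tag_spans(
        "This [b]sentence[/b] is just an [b][i]example[/i][/b]."
    ))
```

Because the stack tracks depth explicitly, this approach handles several tag groups in one string as well as deeper nesting, at the cost of more code than a single pattern.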

Just remove the question mark and take the first group; that would be what you expected. The *? quantifier matches as few times as possible, expanding as needed, but what you need is the default behavior, which matches as many times as possible:

\[\w{1,2}\](.*)\[\/\w{1,2}]
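To see how the lazy and greedy forms behave on the inputs from the question, here is a short Python comparison. The third pattern, BACKREF, is not from the thread: it is an assumed alternative that uses a backreference so the closing tag must carry the same name as the opening tag. The helper name matches is also hypothetical.

```python
import re

LAZY = re.compile(r"\[\w{1,2}\](.*?)\[/\w{1,2}\]")    # pattern from the question
GREEDY = re.compile(r"\[\w{1,2}\](.*)\[/\w{1,2}\]")   # question mark removed
# Assumption (not from the thread): \1 forces the closing tag name
# to equal the opening tag name captured in group 1.
BACKREF = re.compile(r"\[(\w{1,2})\](.*?)\[/\1\]")

nested = "[b][i]nested tags content[/i][/b]"
scenario = "This [b]sentence[/b] is just an [b][i]example[/i][/b]."

def matches(pattern, text):
    """Return the full text of every non-overlapping match."""
    return [m.group(0) for m in pattern.finditer(text)]

if __name__ == "__main__":
    for label, pattern in (("lazy", LAZY), ("greedy", GREEDY), ("backref", BACKREF)):
        print(label, matches(pattern, nested), matches(pattern, scenario))
```

On the single nested example, the greedy pattern captures the whole set. On the scenario sentence, which holds two separate tag groups, the greedy pattern runs from the first opening tag to the last closing tag and returns one oversized match. The backreference variant keeps the two groups apart in this case, but it would still stop early if a tag were nested inside a tag of the same name.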
