Regex - match any text between some delimiters

I am trying to capture the string [[....]] (including the brackets),

where .... can be anything (including non-printable characters) except ]].

Here is the source to match against:

var myString = `blablablabla[["<strong>LA DEFENSE 4 TEMPS ( La Rotonde )</strong><br />Centre commercial LES 4 TEMPS",
                         48.89141725,
                         2.23478235,
                         "4T"],
    ["<strong>ANGERS</strong><br />Centre commercial GEANT",
                         48.89141725,
                         2.23478235,
                         "4T"]]blablablabla`;

I tried [^\]]+ to match every character except a closing bracket. The problem is that I do not know how to extend this to a closing bracket that is immediately followed by a second one; [^\]\]]+ does not do it.

Is there a solution using positive/negative lookahead or a word boundary?

(\[\[[^\](?=\])]+)

Debuggex Demo

Any help, please?

In JavaScript, matching any text between delimiters that consist of more than one character is best achieved with the [^] / [\s\S] / [\d\D] / [\w\W] construct and a lazy quantifier (*? matching 0 or more occurrences, or +? matching 1 or more occurrences of the preceding subpattern, but as few as possible to return a valid match).

While the [^] construct, which matches any character including a newline, is JavaScript-specific, [\s\S] and its variants are mostly cross-platform constructs that work in PCRE, .NET, Python, Java, etc. The [...] in this case is a character class that contains two opposite shorthand classes. Since \s matches all whitespace characters and \S matches all non-whitespace characters, [\s\S] matches any symbol in any input.

I'd recommend avoiding (.|\n). This construct causes more backtracking steps and slows the regex search down.

So, you can use
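As a small illustration of the point above, here is a sketch comparing the dot with [\s\S] and [^] on an input that contains a line break. The sample string and the variable names are made up for this example.

```javascript
// Hypothetical sample input with a line break between the delimiters.
const sample = 'x[[a\nb]]y';

// Without the s (dotAll) flag, "." does not match line terminators,
// so the lazy dot can never reach the closing "]]".
const dotMatch = sample.match(/\[\[.*?]]/);

// [\s\S] matches any character, including "\n".
const anyMatch = sample.match(/\[\[[\s\S]*?]]/);

// [^] is the JavaScript-only way to write "any character".
const caretMatch = sample.match(/\[\[[^]*?]]/);

console.log(dotMatch);                      // no match
console.log(JSON.stringify(anyMatch[0]));   // the whole [[...]] block
console.log(JSON.stringify(caretMatch[0])); // same block as above
```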

\[\[[\d\D]*?]]

See JS regex demo

Here is a code snippet:

var re = /\[\[[\d\D]*?]]/g;
// Sample input from the question, with the line breaks written as \n.
var str = 'blablablabla[["<strong>LA DEFENSE 4 TEMPS ( La Rotonde )</strong><br />Centre commercial LES 4 TEMPS",\n' +
    ' 48.89141725,\n 2.23478235,\n "4T"],\n' +
    ' ["<strong>ANGERS</strong><br />Centre commercial GEANT",\n' +
    ' 48.89141725,\n 2.23478235,\n "4T"]]blablablabla';
var m;
// With the g flag, exec() resumes from the end of the previous match.
while ((m = re.exec(str)) !== null) {
    console.log(m[0]);
}

UPDATE

In this case, where the opening and closing delimiters are different and consist of just 2 characters, you can use the following technique: match 0 or more characters other than the first symbol of the closing delimiter, and then 0 or more sequences of that first symbol followed by 1 or more characters other than it.

\[\[[^\]]*(?:][^\]]+)*]]

See regex demo

Because this regex consumes the input in a straight line, without the position-by-position expansion of a lazy quantifier, it is really fast.
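To check that the unrolled pattern finds the same blocks as the lazy one, here is a minimal sketch run on a shortened, made-up input in the same shape as the question's data. The input and variable names are assumptions for illustration only.

```javascript
// Shortened, made-up input shaped like the question's data, plus a second
// block that contains a single "]" inside it.
const input = 'bla[["A",\n 1.5,\n "4T"],\n ["B",\n 2.5,\n "4T"]]bla [[x]y]] end';

const lazy = /\[\[[\d\D]*?]]/g;
const unrolled = /\[\[[^\]]*(?:][^\]]+)*]]/g;

// With the g flag, match() returns an array of all full matches.
const lazyMatches = input.match(lazy);
const unrolledMatches = input.match(unrolled);

console.log(JSON.stringify(unrolledMatches));
// Compare the two result lists as serialized strings.
console.log(JSON.stringify(lazyMatches) === JSON.stringify(unrolledMatches));
```

Both patterns stop at the first ]] after each [[, so on this input they return the same two blocks. The sketch only compares results; it does not measure speed.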

PS I also want to note that you do not need to escape ] outside of a character class in a JS regex (unless the u flag is used, in which case a lone ] is a syntax error), but it must always be escaped inside a character class.

Try this:
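The escaping rule can be seen in a short sketch. The test string and variable names are invented for this example.

```javascript
// Outside a character class, a literal "]" needs no escape (no u flag here).
const bareOutside = /]/.test('a]b');

// Inside a character class, the escaped form matches a "]".
const escapedInside = /[\]]/.test('a]b');

// Unescaped, "[]" is parsed as an empty class that can never match,
// so /[]]/ means "empty class, then ]" and fails on every input.
const unescapedInside = /[]]/.test('a]b');

// With the u flag, a lone "]" outside a class is rejected when compiling.
let unicodeModeThrows = false;
try {
    new RegExp(']', 'u');
} catch (e) {
    unicodeModeThrows = e instanceof SyntaxError;
}

console.log(bareOutside, escapedInside, unescapedInside, unicodeModeThrows);
```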

\[\[(.|\n)*?\]\]

https://regex101.com/r/gR5oJ3/1

It should match anything between and including [[ ]]. The main issue was dealing with newlines, and the (.|\n) part will match anything including newlines.
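A quick check of this pattern on a tiny, made-up input is sketched below. Note that (.|\n) is a capturing group, so the result also carries the last character the group consumed. The names used here are hypothetical.

```javascript
// Hypothetical input with a newline inside the delimiters.
const text = 'pre[[a\nb]]post';

// Without the g flag, match() returns the full match plus capture groups.
const result = text.match(/\[\[(.|\n)*?\]\]/);

console.log(JSON.stringify(result[0])); // the whole [[...]] block
console.log(JSON.stringify(result[1])); // last character captured by (.|\n)
```

If the captured character is not needed, a non-capturing group (?:.|\n) avoids storing it.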
