
Regex - How to match all brackets that are not inside quotes

I have the following string:

"Hello, don't match this: !{SUM(1:2)}" but match this: !{SUM(1,2)} "And not this: !{MEAN(3,4)}"

I currently have the following regex (\!{(:?\[??[^\[]*?\})) to match all !{}

How could I add to this to match all !{} that are NOT inside a pair of double quotes?

Not sure if this is even possible, since the middle section would technically be between 2 double quotes.

Thanks for any help!

https://regex101.com/r/QAmbym/1

^(?>".*"|[^!{}])*\K!{.*}

^ - Anchor to the start of the string. This is important because of what comes next.

(?> - Atomic group. Once it has matched, the engine doesn't backtrack into it.

".*" - Consume anything in quotes. Once consumed, the engine won't backtrack into it.

| - or

[^!{}] - Match any character that isn't one we're interested in. This is still inside the atomic group, so anything consumed here can't be given back and matched later.

)* - Close the atomic group and match the entire group 0 or more times.

\K - Reset the start of the match. Everything matched up to this point is dropped from the result.

!{.*} - What you're looking for.

So because we start out anchored, we consume anything that is in quotes or that definitely isn't what we want. It's consumed in a way that won't let the engine backtrack into it. Then, once we get to what we're looking for, we throw away everything consumed so far. This means anything in quotes won't be matched.

Edit:

Thanks to @Nick for posting the link to the other thread, I've found there's a much simpler way to go about it that also works with JavaScript:

https://regex101.com/r/Y3QHTc/1

".*"|(!{.*})

".*" - match anything in quotes, ungrouped

| - or

(!{.*}) - match what you're looking for in group 1.

While technically you're also matching unwanted strings, you can filter out every match that doesn't return a group 1, leaving you with what you're looking for.
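A minimal JavaScript sketch of this filtering step, assuming quotes are balanced and never escaped. It uses lazy quantifiers (`".*?"` and `!\{.*?\}`) rather than the greedy `.*` shown above, because on the sample string a greedy `".*"` would run from the first quote to the last one and leave nothing for group 1. The helper name `findUnquoted` is made up for illustration.

```javascript
// Hypothetical helper: return every !{...} that sits outside double quotes.
// Assumption: quotes are balanced and never escaped inside a quoted run.
function findUnquoted(input) {
  // Alternative 1 swallows a quoted run (no capture group).
  // Alternative 2 captures a !{...} block into group 1.
  const pattern = /".*?"|(!\{.*?\})/g;
  const results = [];
  for (const match of input.matchAll(pattern)) {
    // Quoted runs leave group 1 undefined, so they are skipped here.
    if (match[1] !== undefined) {
      results.push(match[1]);
    }
  }
  return results;
}

const sample = `"Hello, don't match this: !{SUM(1:2)}" but match this: !{SUM(1,2)} "And not this: !{MEAN(3,4)}"`;
console.log(findUnquoted(sample)); // [ '!{SUM(1,2)}' ]
```

Each `match` object also carries an `index` into the original string, which may be useful if the position of the unquoted group matters.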

Since atomic groups are not supported in JS, you can try this simple method:

  1. Match everything that is inside quotes and remove it
  2. Check if the remaining string has the things that you want to match: !{.+}

    let s = `"Hello, don't match this: !{SUM(1:2)}" but match this: !{SUM(1,2)} "And not this: !{MEAN(3,4)}"`
    let pattern = /!{.+}/gm

    // Remove every quoted run first, then test what is left.
    if (pattern.test(s.replaceAll(/".+?"/gm, ''))) {
        console.log("Match")
    } else {
        console.log("Not match")
    }

Here s.replaceAll(/".+?"/gm, '') gives you: but match this: !{SUM(1,2)}
Then you apply the pattern !{.+} to it to match everything between !{ and }.
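If the matches themselves are needed rather than a yes/no answer, the same two steps could be sketched as below. This is an illustration under assumptions: `extractUnquoted` is a made-up name, quotes are assumed to be balanced, and a lazy `!\{.+?\}` is swapped in for `!{.+}` so that several groups on one line come back separately.

```javascript
// Hypothetical helper: strip quoted runs, then collect the !{...} groups that remain.
function extractUnquoted(input) {
  // Step 1: drop every double-quoted run (lazy, so each pair is removed on its own).
  const withoutQuoted = input.replace(/".+?"/g, "");
  // Step 2: collect what is left; match() returns null when nothing is found.
  return withoutQuoted.match(/!\{.+?\}/g) ?? [];
}

const text = `"Hello, don't match this: !{SUM(1:2)}" but match this: !{SUM(1,2)} "And not this: !{MEAN(3,4)}"`;
console.log(extractUnquoted(text)); // [ '!{SUM(1,2)}' ]
```

Because the quoted text is deleted before matching, any match positions refer to the stripped string, not to the original input.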
