
Regex grab all text between brackets, and NOT in quotes

I'm attempting to match all text between {brackets}, but not if it is inside quotation marks. For example:

$str = 'value that I {want}, vs value "I do {NOT} want" ';

My results should capture "want" but omit "NOT". I've searched Stack Overflow desperately for a regex that could perform this operation, with no luck. I've seen answers that get the text between quotes, but none that get text that is outside quotes and inside brackets. Is this even possible?

And if so, how is it done?

So far this is what I have:

preg_match_all('/{([^}]*)}/', $str, $matches);

But unfortunately it gets all text inside brackets, including {NOT}.

It's quite tricky to get this done in one go. I also wanted to make it compatible with nested brackets, so let's use a recursive pattern:

("|').*?\1(*SKIP)(*FAIL)|\{(?:[^{}]|(?R))*\}

OK, let's explain this mysterious regex:

("|')                   # match either a single or a double quote and put it in group 1
.*?                     # match anything ungreedy until ...
\1                      # match what was matched in group 1
(*SKIP)(*FAIL)          # make it skip this match since it's a quoted set of characters
|                       # or
\{(?:[^{}]|(?R))*\}     # match a pair of brackets (even if they are nested)


Some PHP code:

$input = <<<INP
value that I {want}, vs value "I do {NOT} want".
Let's make it {nested {this {time}}}
And yes, it's even "{bullet-{proof}}" :)
INP;

preg_match_all('~("|\').*?\1(*SKIP)(*FAIL)|\{(?:[^{}]|(?R))*\}~', $input, $m);

print_r($m[0]);

Sample output:

Array
(
    [0] => {want}
    [1] => {nested {this {time}}}
)
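The pattern above returns each match with its braces. The question asked for the inner text ("want"), so here is a minimal sketch of one way to get it, assuming the braces are not nested: replace the recursive part with a plain capture group and read group 2. The function name `bracesOutsideQuotes` is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): returns the text inside
// each {...} pair that is not enclosed in single or double quotes.
// Assumption: braces are not nested, so a simple [^{}]* capture is enough.
function bracesOutsideQuotes(string $text): array
{
    // Alternative 1: consume a quoted run, then discard it with (*SKIP)(*FAIL).
    // Alternative 2: match {...} and capture the inner text in group 2.
    $pattern = '~("|\').*?\1(*SKIP)(*FAIL)|\{([^{}]*)\}~';
    preg_match_all($pattern, $text, $m);

    // Group 2 holds the inner text of every successful match.
    return $m[2];
}

$str = 'value that I {want}, vs value "I do {NOT} want" ';
print_r(bracesOutsideQuotes($str));
```

For the question's sample string this should give a one-element array containing `want`. For nested braces, the recursive pattern above still applies, and the outer braces would have to be trimmed from each match afterwards.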

Personally I'd process this in two passes: the first to strip out everything between double quotes, the second to pull out the text you want.

Something like this perhaps:

$str = 'value that I {want}, vs value "I do {NOT} want" ';

// Get rid of everything in between double quotes
$str = preg_replace("/\".*\"/U","",$str);

// Now I can safely grab any text between curly brackets
preg_match_all("/\{(.*)\}/U",$str,$matches);

Working example here: http://3v4l.org/SRnva
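The two passes can also be wrapped in a small function, which makes the result easy to check. This is only a sketch: the name `extractUnquotedBraces` is hypothetical, and it swaps the ungreedy `.*` with the `U` flag for negated character classes (`[^"]*` and `[^{}]*`), which behave the same on the sample string but also span line breaks. Like the snippet above, it only handles double quotes and non-nested braces.

```php
<?php
// Hypothetical wrapper around the two-pass idea (name and signature are assumptions).
function extractUnquotedBraces(string $text): array
{
    // Pass 1: drop every double-quoted run, including the quotes themselves.
    $withoutQuoted = preg_replace('/"[^"]*"/', '', $text);

    // Pass 2: capture the text inside each remaining {...} pair.
    preg_match_all('/\{([^{}]*)\}/', $withoutQuoted, $matches);

    // Group 1 holds the inner text without the braces.
    return $matches[1];
}

$str = 'value that I {want}, vs value "I do {NOT} want" ';
print_r(extractUnquotedBraces($str));
```

On the sample string this should return a single element, `want`.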
