
Regex to not match when a string occurs between two strings

I have the following parsing scenario in Python, where each line falls into one of these cases:

  1. {{ name xxxxxxCONTENTxxxxx /}}
  2. {{ name }} xxxxxxxCONTENTxxxxxxx {{ name /}}
  3. {{ name xxxxxxCONTENTxxx {comand} xxxxCONTENTxxx /}}

All I need to do is classify which case a given line belongs to, using regex.

I can successfully distinguish between 1) and 2), but I am having trouble dealing with 3).

To catch 1) I use:

re.match(r'\s*{{[^{]*?/}}\s*', line)

To catch 2) I use:

re.match(r'{{.*?}}', line)

and then set a flag to keep the context, since case 2) can span multiple lines. How can I catch case 3)?

The condition I'm currently trying to match is:

- start with '{{'
- end with '/}}'
- with no '{{' in between

However, I'm having a hard time phrasing this in regex.

The conditions:

- start with '{{'
- end with '/}}'
- with no '{{' in between

are a perfect fit for a tempered greedy token.

^{{(?:(?!{{|/}}).)*/}}$
   ^^^^^^^^^^^^^^^^

See the regex demo.

The (?:(?!{{|/}}).)* part matches any text that contains neither {{ nor /}} (so it matches up to the first /}}). The anchors (^ and $) ensure that only a whole string that starts with {{, ends with /}}, and has no {{ inside is matched. Note that with re.match, you do not need the ^ anchor.
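As a rough illustration of the tempered greedy token, here is a minimal Python sketch. The names SELF_CLOSING and is_self_closing are made up for this example, and the braces are backslash-escaped purely as a stylistic choice; the pattern is otherwise the one shown above, with ^ dropped because re.match already anchors at the start.

```python
import re

# Tempered greedy token: consume one character at a time, but only while the
# text at the current position starts with neither "{{" nor "/}}".
# Braces are escaped for readability (a choice made for this sketch).
SELF_CLOSING = re.compile(r"\{\{(?:(?!\{\{|/\}\}).)*/\}\}$")


def is_self_closing(line):
    """Hypothetical helper: True if line starts with '{{', ends with '/}}',
    and has no '{{' in between."""
    # re.match anchors at the start of the string; '$' pins the end.
    return SELF_CLOSING.match(line) is not None


samples = [
    "{{ name xxxxxxCONTENTxxxxx /}}",
    "{{ name }} xxxxxxxCONTENTxxxxxxx {{ name /}}",
    "{{ name xxxxxxCONTENTxxx {comand} xxxxCONTENTxxx /}}",
]
for s in samples:
    print(is_self_closing(s), s)
```

Note that this pattern accepts both the first and the third kind of line, since a single { is not rejected by the lookahead; only {{ is.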

Now, to match only the third type of string, you need to specify that your pattern must contain {....}:

^{{(?:(?!{{|/}}).)*{[^{}]*}(?:(?!{{|/}}).)*/}}$
   | ----  1 -----|| - 2 -||--------1-----|

See another regex demo.

Part 1 is the tempered greedy token described above, and {[^{}]*} matches a single {...} substring, making it mandatory in the input.
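Putting the pieces together, a classifier for the three cases might look like the sketch below. The function name classify, the constant names, the call to strip(), and the return values are assumptions made for this example; the case 2 pattern is the one from the question, and the other two are the patterns from this answer with the braces escaped.

```python
import re

# Part 1: the tempered greedy token (no "{{" and no "/}}" at any position).
TOKEN = r"(?:(?!\{\{|/\}\}).)*"

# Case 3: self-closing tag that must contain a single {...} group (part 2).
CASE_3 = re.compile(r"\{\{" + TOKEN + r"\{[^{}]*\}" + TOKEN + r"/\}\}$")
# Case 1: self-closing tag with no "{{" inside.
CASE_1 = re.compile(r"\{\{" + TOKEN + r"/\}\}$")
# Case 2: an opening tag, using the pattern from the question.
CASE_2_OPEN = re.compile(r"\{\{.*?\}\}")


def classify(line):
    """Hypothetical helper: return 1, 2 or 3 for the cases above, else None."""
    line = line.strip()  # assumption: surrounding whitespace is irrelevant
    # Case 3 is checked first because the case 1 pattern also matches it.
    if CASE_3.match(line):
        return 3
    if CASE_1.match(line):
        return 1
    if CASE_2_OPEN.match(line):
        return 2
    return None


for s in ("{{ name xxxxxxCONTENTxxxxx /}}",
          "{{ name }} xxxxxxxCONTENTxxxxxxx {{ name /}}",
          "{{ name xxxxxxCONTENTxxx {comand} xxxxCONTENTxxx /}}"):
    print(classify(s), s)
```

The order of the checks matters here: the more specific case 3 pattern has to be tried before the case 1 pattern. Tracking a case 2 block that spans several lines would still need the flag described in the question.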
