Python - Match string between { } characters, but not between {{ }}

I'm trying to match some variable names in an HTML document to populate a dictionary. I have the HTML

<div class="no_float">
    <b>{node_A_test00:02d}</b>{{css}}
    <br />
    Block mask: {block_mask_lower_node_A} to {block_mask_upper_node_A}
    <br />
</div>
<div class="sw_sel_container">
    Switch selections: 
    <table class="sw_sel">
        <tr>
            <td class="{sw_sel_node_A_03}">1</td>
            <td class="{sw_sel_node_A_03}">2</td>
            <td class="{sw_sel_node_A_03}">3</td>
            <td class="{sw_sel_node_A_04}">4</td>
            <td class="{sw_sel_node_A_05}">5</td>

I want to match the text between { and either } or :. But if it starts with {{, I don't want to match it at all (I will be using this for inline CSS).

So far I have the regular expression

(?<=\{)((?!{).*?)(?=\}|:)

but this still matches the text inside {{css}}.

You could do something like this:

import re

re.findall(r'''
    (?<!\{)    # No opening bracket before
    \{         # Opening bracket
      ([^}]+)  # Stuff inside brackets
    \}         # Closing bracket
    (?!\})     # No closing bracket after
''', '{foo} {{bar}} {foo}', flags=re.VERBOSE)
# returns ['foo', 'foo']

This seems to work:

(?<=(?<!{){)[^{}:]+

and this one with a capture group:

(?<!{){([^{}:]+)
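For a quick check of both patterns, here is a minimal sketch that runs them over a short sample string. The sample is made up for illustration, loosely based on the HTML in the question, and the variable names are hypothetical.

```python
import re

# Hypothetical sample text, loosely modelled on the HTML in the question.
sample = "<b>{node_A_test00:02d}</b>{{css}} {block_mask_lower_node_A}"

# Lookbehind-only version: the whole match is the variable name.
lookbehind_only = re.findall(r"(?<=(?<!{){)[^{}:]+", sample)

# Capture-group version: the variable name is in group 1.
with_capture = re.findall(r"(?<!{){([^{}:]+)", sample)

print(lookbehind_only)
print(with_capture)
```

Note that neither pattern requires a closing } or : after the name, so a stray single { followed by text would also produce a match.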

I see that you've already found a solution that works, but I thought it might be worthwhile to explain what the problem with your original regex is.

  • (?<=\{) means that a { must precede whatever matches next. Fair enough.
  • ((?!{).*?) will match anything that starts with a character other than {. Okay, so we're only matching things inside the braces. Good.

But now consider what happens when you have two opening braces: {{bar}}. Consider the substring bar. What precedes the b? A {. Does bar start with {? Nope. So the regex will consider this a match.

You have, of course, prevented the regex from matching {bar, which is what it would do if you left the (?!{) out of your pattern, because {bar starts with a {. But as soon as the regex engine determines that no valid match starts at the { character, it moves on to the next character, b, and sees that a match starts there.
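The behaviour described above can be reproduced with a small sketch. The test strings and variable names here are illustrative assumptions, not taken from the question.

```python
import re

# The regex from the question.
original = r"(?<=\{)((?!{).*?)(?=\}|:)"

# No match can start at either brace of "{{bar}}", but one can start at "b":
# "b" is preceded by "{" and is not itself "{".
original_matches = re.findall(original, "{foo} {{bar}}")
print(original_matches)

# Same regex with the (?!{) check removed: the match now starts
# right after the first brace and includes the second one.
without_check = r"(?<=\{)(.*?)(?=\}|:)"
unchecked_matches = re.findall(without_check, "{{bar}}")
print(unchecked_matches)
```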

Now, just for kicks, here's the regex I'd use:

(?<!{){([^{}:]+)[}:](?!})

  • (?<!{) : the match shouldn't be preceded by { .
  • { : the match starts with an open brace.
  • ([^{}:]+) : group everything that isn't an open brace, close brace, or colon. This is the part of the match that we actually want.
  • [}:] : end the match with a close brace or colon.
  • (?!}) : the match shouldn't be followed by } .
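To tie this back to the original goal of populating a dictionary, here is a minimal sketch that uses a negative lookbehind (?<!{) and a negative lookahead (?!}) around the capture group. The HTML is abbreviated from the question. The names html, pattern and variables, and the use of None as a placeholder value, are assumptions for illustration.

```python
import re

# Abbreviated from the HTML in the question.
html = '''<b>{node_A_test00:02d}</b>{{css}}
Block mask: {block_mask_lower_node_A} to {block_mask_upper_node_A}
<td class="{sw_sel_node_A_03}">1</td>
<td class="{sw_sel_node_A_03}">2</td>'''

pattern = re.compile(r"(?<!{){([^{}:]+)[}:](?!})")

# dict.fromkeys keeps first-seen order and drops repeated names;
# every value starts out as None (a placeholder assumption).
variables = dict.fromkeys(pattern.findall(html))
print(list(variables))
```

The {{css}} placeholder is skipped entirely, and the format spec after the colon in {node_A_test00:02d} is left out of the captured name.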
