Regex find pattern between blocks

I'm building a Django-like templating engine in PHP that replaces everything between {{ }} with its related data. That part works, but I now need to perform the replacement only inside blocks, such as {% for y in x %} loop blocks, and ignore all brackets that are not inside them.

I got partial results in this regex101 example, but it only matches the first {{ }} of each block. What I want is to match all {{ }} in each block, excluding the ones that are outside.

For learning purposes (very good!) you have several possibilities:

  1. A multi-step approach (easier to comprehend and to maintain)

  2. An overall regex solution (more complicated and possibly more "fancy")


Ad 1)

Match the blocks with the following expression, written in verbose mode (see a demo on regex101.com):

{%\ for.*?%}
(?s:.+?)
{%\ endfor.*?%}

And look for pairs of {{...}} in each block with:

{{\s*(.+?)\s*}}

In PHP, this could be:

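<?php
// Sample template: two for-blocks plus one variable outside any block.
$data = <<<DATA
{% for user in users %}
   Hello, {{ user.name }}, you are {{ user.age }} {{ user.name }}
ssssssssssssssssssssss {{ user.name }}
sdsddddddddddddddddddddddddddddd
{% endfor %}

{% for dog in dogs %}
   Your dog is {{ dog.age }} and likes {{ dog.food }}.
{% endfor %}
wwww
{{ user.name }}
DATA;

// Step 1: a whole {% for %} ... {% endfor %} block (verbose mode, hence "\ ").
$block = '~
            {%\ for.*?%}
            (?s:.+?)
            {%\ endfor.*?%}
            ~x';

// Step 2: a single {{ variable }} with the surrounding whitespace trimmed.
$variable = '~{{\s*(.+?)\s*}}~';

// PREG_SET_ORDER yields one entry per block, so $match[0] is that block's text.
// With the default ordering, $match[0] would only ever be the first block.
if (preg_match_all($block, $data, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $match) {
        if (preg_match_all($variable, $match[0], $variables, PREG_SET_ORDER)) {
            print_r($variables);
        }
    }
}
?>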

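Since the question is about replacing rather than just finding the variables, the two steps can also be nested with preg_replace_callback. What follows is only a minimal sketch under assumptions: the names $template and $values and the flat lookup array are hypothetical, and the loop itself is not expanded, only the {{ }} pairs inside each block are substituted.

```php
<?php
// Hypothetical template and values, for illustration only.
$template = <<<TPL
{% for dog in dogs %}
Your dog is {{ dog.age }} and likes {{ dog.food }}.
{% endfor %}
{{ dog.age }}
TPL;

$values = ['dog.age' => '3', 'dog.food' => 'bones'];

// Same two expressions as in ad 1).
$block    = '~{%\ for.*?%}(?s:.+?){%\ endfor.*?%}~x';
$variable = '~{{\s*(.+?)\s*}}~';

// Outer callback: runs once per for-block.
// Inner callback: substitutes each {{ ... }} found inside that block only.
$result = preg_replace_callback(
    $block,
    function (array $blockMatch) use ($variable, $values) {
        return preg_replace_callback(
            $variable,
            function (array $var) use ($values) {
                // Unknown names are left untouched (an assumption of this sketch).
                return $values[$var[1]] ?? $var[0];
            },
            $blockMatch[0]
        );
    },
    $template
);

echo $result, PHP_EOL;
```

The {{ dog.age }} on the last line sits outside any block, so it is never handed to the inner callback and stays as it is.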


Ad 2)

Match all of the variables in question with one overall expression. Here, you'll need \G (which matches at the position where the last match ended) and some lookaheads (see a demo for this one on regex101.com as well):

(?:{%\ for.+?%}
|
\G(?!\A)
)
(?s:(?!{%).)*?\K
{{\s*(?P<variable>.+?)\s*}}

Now let's demystify this expression:

(?:{%\ for.+?%}
|
\G(?!\A)
)

Here, we want to match either {%\ for.+?%} (we need the \ before the space as we are in verbose mode) or the position where the last match ended, via \G. Now, the truth is, \G matches either at the end of the last match or at the very beginning of the string. We do not want the latter, hence the negative lookahead (?!\A).
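To see what the (?!\A) guard buys, here is a small sketch that runs the overall expression with and without it. The sample string and the names $withGuard and $withoutGuard are made up for this illustration; the sample puts a variable at the very start of the string, outside any block.

```php
<?php
// Hypothetical input: {{ title }} sits at offset 0, outside any for-block.
$data = "{{ title }}\n{% for u in users %}\n{{ u.name }}\n{% endfor %}";

// The overall expression from ad 2), once as given and once without (?!\A).
$withGuard    = '~(?:{%\ for.+?%}|\G(?!\A))(?s:(?!{%).)*?\K{{\s*(?P<variable>.+?)\s*}}~x';
$withoutGuard = '~(?:{%\ for.+?%}|\G)(?s:(?!{%).)*?\K{{\s*(?P<variable>.+?)\s*}}~x';

preg_match_all($withGuard, $data, $guarded);
preg_match_all($withoutGuard, $data, $unguarded);

// With the guard, only the variable inside the block is collected.
print_r($guarded['variable']);

// Without it, \G also succeeds at the start of the string,
// so the leading {{ title }} is picked up as well.
print_r($unguarded['variable']);
```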

The next part

(?s:(?!{%).)*?\K

kind of does a "fast forward" to the interesting parts in question.

Broken down, this says

(?s:          # open a non-capturing group, enabling the DOTALL mode
    (?!{%).   # neg. lookahead, do not overrun {% (the closing tag)
)*?           # lazy quantifier for the non-capturing group
\K            # make the engine "forget" everything to the left
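The effect of \K and of the (?!{%) guard can be checked in isolation. The sketch below is an illustration only: the sample strings and variable names are hypothetical, and the tags are kept on their own lines, as in the sample data, because the dot in the opening tag does not cross line breaks.

```php
<?php
// Hypothetical one-block sample with a stray variable after the block.
$subject = "{% for x in xs %}\ntext {{ x.id }}\n{% endfor %}\n{{ y }}";

// Same "fast forward" group, once with \K and once without it.
$withK    = '~{%\ for.+?%}(?s:(?!{%).)*?\K{{\s*.+?\s*}}~x';
$withoutK = '~{%\ for.+?%}(?s:(?!{%).)*?{{\s*.+?\s*}}~x';

preg_match($withK, $subject, $a);
preg_match($withoutK, $subject, $b);

var_dump($a[0]); // only the {{ ... }} part is reported
var_dump($b[0]); // the opening tag and the skipped text are reported too

// The guard keeps the skip inside the block: this block contains no variable,
// and the {{ y }} after {% endfor %} cannot be reached from the opening tag.
$empty = "{% for x in xs %}\ntext\n{% endfor %}\n{{ y }}";
var_dump(preg_match($withK, $empty));
```

Without \K the reported match would start at the opening tag, which matters as soon as the match is used for a replacement rather than just for extraction.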

Now, the rest is easy:

{{\s*(?P<variable>.+?)\s*}}

It's basically the same construct as for ad 1).

Again, in PHP, this could be:

<?php
// $data is the sample template defined in ad 1).
$regex = '~
          (?:{%\ for.+?%}
          |
          \G(?!\A)
          )
          (?s:(?!{%).)*?\K
          {{\s*(?P<variable>.+?)\s*}}
          ~x';

if (preg_match_all($regex, $data, $variables)) {
    print_r($variables[1]);
}
?>
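Because \K makes the reported match start at the {{, the same expression can in principle drive a single-pass replacement. The following is a hedged sketch, not a finished engine: $template and $context are hypothetical, the lookup is a flat array, and the tags are assumed to sit on their own lines as in the sample data.

```php
<?php
// Hypothetical template and flat lookup table, for illustration only.
$template = <<<TPL
{% for user in users %}
Hello, {{ user.name }} ({{ user.age }})
{% endfor %}
{{ footer }}
TPL;

$context = ['user.name' => 'Ada', 'user.age' => '36'];

// The overall expression from ad 2), written on one line.
$regex = '~(?:{%\ for.+?%}|\G(?!\A))(?s:(?!{%).)*?\K{{\s*(?P<variable>.+?)\s*}}~x';

// Only the text after \K is replaced, so tags and surrounding text survive.
$out = preg_replace_callback(
    $regex,
    function (array $m) use ($context) {
        // Unknown names are left untouched (an assumption of this sketch).
        return $context[$m['variable']] ?? $m[0];
    },
    $template
);

echo $out, PHP_EOL;
```

The {{ footer }} line lies after {% endfor %}, so the expression never reaches it and it is left in place.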


With all that said, it's generally a good idea to learn more complex patterns, but on the other hand not to reinvent the wheel: there's always someone smarter than you and me who has probably taken several edge cases into account.
