RegExp practice: reluctant quantifiers with lookahead assertions

Question

Can you explain to me how this works? Here is an example:

<!-- The quick brown fox 
              jumps over the lazy dog -->

<!--[if IE 7]>
    <link rel="stylesheet" type="text/css" href="/supersheet.css" />
<![endif]-->

<!-- Pack my box with five dozen liquor jugs -->

First, I tried to use the following regular expression to match the content inside conditional comments:

/<!--.*?stylesheet.*?-->/s

It failed, because the regular expression matches everything from the first <!-- up to the first --> that follows stylesheet. Then I tried another pattern with a lookahead assertion:

/<!--(?=.*?stylesheet).*?-->/s

It works and matches exactly what I need. However, the following regular expression works as well:

/<!--(?=.*stylesheet).*?-->/s

The last regular expression does not have a reluctant quantifier in the lookahead assertion, and now I am confused. Can anyone explain to me how it works? Maybe there is a better solution for this example?
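The over-match produced by the first pattern can be reproduced with a short sketch. It uses Python's re module purely for illustration (the question itself targets PHP/PCRE), with the inline flag (?s) standing in for the /s modifier; the variable names are made up for this example.

```python
import re

# Sample document from the question.
doc = """<!-- The quick brown fox
              jumps over the lazy dog -->

<!--[if IE 7]>
    <link rel="stylesheet" type="text/css" href="/supersheet.css" />
<![endif]-->

<!-- Pack my box with five dozen liquor jugs -->
"""

# (?s) lets the dot match newlines, like the /s modifier.
first_try = re.search(r"(?s)<!--.*?stylesheet.*?-->", doc)

# The engine starts at the leftmost "<!--" and the lazy dot simply keeps
# expanding until "stylesheet" is reached, crossing the first comment.
print(first_try.start())                                # 0
print(first_try.group().startswith("<!-- The quick"))   # True
print(first_try.group().endswith("<![endif]-->"))       # True
```

A reluctant quantifier only prefers the shortest extension from a fixed starting point; it does not move the starting point forward.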

Updated:

I tried using the regular expressions with the lookahead assertion in another document, and they failed to match the content between the comments. So both lookahead patterns above (the one with .*? and the one with .* inside the lookahead) are not correct. Do not use them; try the other suggestions.
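One likely reason for the failure: a lookahead is zero-width, so it only checks that stylesheet occurs somewhere later in the input, not inside the same comment. The sketch below uses Python's re module for illustration and a small hypothetical document (other_doc is not the author's actual second document, which is not shown).

```python
import re

# Hypothetical document: an ordinary comment followed by a conditional one.
other_doc = (
    "<!-- plain note -->\n"
    "<!--[if IE 7]>\n"
    '<link rel="stylesheet" href="/a.css" />\n'
    "<![endif]-->\n"
)

lazy = re.findall(r"(?s)<!--(?=.*?stylesheet).*?-->", other_doc)
greedy = re.findall(r"(?s)<!--(?=.*stylesheet).*?-->", other_doc)

# Both variants also accept the ordinary comment, because "stylesheet"
# appears later in the text and the lookahead is satisfied.
print(lazy[0])          # <!-- plain note -->
print(len(lazy))        # 2
print(lazy == greedy)   # True
```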

Updated:

The solution has been found by Jonny 5 (see the answer). He suggested three options:

Using a negated hyphen to limit the match. This option works only if there is no hyphen between the tags. If a stylesheet has the URL /style-sheet.css, it will not work.
Using the escape sequence \K. It works like a charm. The downsides are the following:
- It is terribly slow (in my case, it was 8-10 times slower than the other solutions)
- Only available since PHP 5.2.4
Using a lookahead to narrow the match. This is the goal I tried to achieve, but my experience with lookaround assertions was insufficient to perform the task.

I think the following is a good solution for my example:

/(?s)<!--(?:(?!<!).)+?stylesheet.+?-->/

The same, but with the s modifier at the end:

/<!--(?:(?!<!).)+?stylesheet.+?-->/s

As I said, this is a good solution, but I managed to improve the pattern and found another one that works faster in my case.

So, the final solution is the following:

/<!--(?:(?!-->).)+?stylesheet.+?-->/s
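As a quick check, both tempered-dot patterns can be run against the sample document. This is a minimal sketch in Python's re module (an illustration choice; the post targets PHP/PCRE), with (?s) replacing the /s modifier and variable names invented for the example.

```python
import re

doc = """<!-- The quick brown fox
              jumps over the lazy dog -->

<!--[if IE 7]>
    <link rel="stylesheet" type="text/css" href="/supersheet.css" />
<![endif]-->

<!-- Pack my box with five dozen liquor jugs -->
"""

# Dot guarded by (?!<!): never steps onto the start of another "<!".
via_open = re.search(r"(?s)<!--(?:(?!<!).)+?stylesheet.+?-->", doc)
# Dot guarded by (?!-->): never steps onto the end of the current comment.
via_close = re.search(r"(?s)<!--(?:(?!-->).)+?stylesheet.+?-->", doc)

print(via_close.group())
print(via_open.group() == via_close.group())   # True
```

On this input both patterns return only the conditional comment. They differ in which boundary the guarded dot refuses to cross, so they are not guaranteed to agree on every document.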

Thanks to all the participants for the interesting answers.

Answer 1

The string stylesheet is mentioned only once in your test document, so both regular expressions you tried will match the same thing, but in different ways.

/<!--(?=.*?stylesheet).*?-->/s

This one does the following:

Match <!-- .
Look ahead, scanning characters up to and including stylesheet . Fail if not found.
Match characters up to and including --> .

/<!--(?=.*stylesheet).*?-->/s

This one does the following:

Match <!-- .
Look ahead, taking every character until no more can be taken. Backtrack, repeatedly trying to match stylesheet . Fail if not found.
Match characters up to and including --> .

Basically, one needs to backtrack significantly while the other doesn't.

If your subject instead is...

<!-- The quick brown fox 
              jumps over the lazy dog -->

<!--[if IE 7]>
    <link rel="stylesheet" type="text/css" href="/supersheet.css" /> <![endif]-->

<!-- Pack my box with five dozen stylesheets -->

the two lookaheads land in different places. The lookahead in the former would find the first stylesheet , while the one in the latter would find the second (and last), since it starts searching from the end of the string.
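Where each lookahead lands can be inspected with a capture group placed inside it. The sketch below uses Python's re module for illustration; the extra groups are added only for inspection and are not part of the patterns under discussion.

```python
import re

subject = """<!-- The quick brown fox
              jumps over the lazy dog -->

<!--[if IE 7]>
    <link rel="stylesheet" type="text/css" href="/supersheet.css" /> <![endif]-->

<!-- Pack my box with five dozen stylesheets -->
"""

# Group 1 records how far the dot ran before "stylesheet" matched.
lazy = re.search(r"(?s)<!--(?=(.*?)stylesheet).*?-->", subject)
greedy = re.search(r"(?s)<!--(?=(.*)stylesheet).*?-->", subject)

print(lazy.end(1) == subject.index("stylesheet"))     # True: first occurrence
print(greedy.end(1) == subject.rindex("stylesheet"))  # True: last occurrence

# The lookahead consumes nothing, so the text actually matched is the same.
print(lazy.group() == greedy.group())                 # True
```

Because the assertion is zero-width, the difference shows up in the amount of scanning and backtracking rather than in the text returned by the trailing .*?--> part.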

Answer 2

To match only the conditional comment, there are many ways:

1.) Use a negated hyphen [^-] to limit the match and stay in between <!-- and stylesheet

(?s)<!--[^-]+stylesheet.+?-->

[^-] allows only characters that are not a hyphen. See test at regex101.
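The limitation of the negated hyphen can be seen with a small sketch. It uses Python's re module for illustration, and both input strings are hypothetical; the second one assumes a link tag whose hyphenated URL comes before the word stylesheet.

```python
import re

negated = re.compile(r"(?s)<!--[^-]+stylesheet.+?-->")

plain = (
    "<!--[if IE 7]>\n"
    '<link rel="stylesheet" href="/supersheet.css" />\n'
    "<![endif]-->"
)
# Hypothetical variant: a hyphen sits between "<!--" and "stylesheet".
hyphenated = (
    "<!--[if IE 7]>\n"
    '<link href="/style-sheet.css" rel="stylesheet" />\n'
    "<![endif]-->"
)

print(negated.search(plain) is not None)   # True
print(negated.search(hyphenated) is None)  # True: [^-]+ cannot cross the hyphen
```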

2.) To get the "last" or closest match without much regex effort, you can also put a greedy dot in front to eat up the preceding text. This makes sense if you are not matching globally and there is only one item to match. Use \K to reset the match start after the greedy part:

(?s)^.*\K<!--.+?stylesheet.+?-->

See test at regex101. You can also use a capture group and grab $1: (?s)^.*()
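A sketch of the capture-group variant follows. Python's re module is used for illustration and does not support \K, so only the group form is shown. The contents of the group are an assumption here: it is taken to wrap the same sub-pattern that follows \K in the pattern above.

```python
import re

doc = """<!-- The quick brown fox
              jumps over the lazy dog -->

<!--[if IE 7]>
    <link rel="stylesheet" type="text/css" href="/supersheet.css" />
<![endif]-->

<!-- Pack my box with five dozen liquor jugs -->
"""

# The greedy ^.* first swallows the whole input, then backtracks until the
# group can match, which yields the last "<!--" followed by "stylesheet".
m = re.search(r"(?s)^.*(<!--.+?stylesheet.+?-->)", doc)

print(m.group(1))
print(m.start(1) == doc.index("<!--[if IE 7]>"))   # True
```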

3.) Using a lookahead to narrow it down is usually more costly:

(?s)<!--(?:(?!<!).)+?stylesheet.+?-->

See test at regex101. (?!<!). looks ahead at each character between <!-- and stylesheet to check that it does not start another <! , so the match stays inside one element. This is similar to the negated hyphen solution.

Instead of .* I used .+ (one or more); which to use depends on what is to be matched. Here + fits better.
Which solution to use depends on the exact requirements. For this case I would use the first.

2 answers

Answer 1
2 2015-08-16 01:53:57

Answer 2
2 Accepted 2015-08-16 08:01:01
