Regular expression to find a string pattern in PHP

Question

Suppose I have a string that looks like:

"lets refer to [[merp] [that entry called merp]] and maybe also to that entry called [[blue] [blue]]"

The idea here is to replace a block of [[name][some text]] with <a href="name.html">some text</a>.

So I'm trying to use regular expressions to find blocks that look like [[name][some text]], but I'm having tremendous difficulty.

Here's what I thought should work (in PHP): preg_match_all('/\[\[.*\]\[.*\]/', $my_big_string, $matches)

But this just returns a single match, the string from '[[merp' to 'blue]]'. How can I get it to return the two matches [[merp][that entry called merp]] and [[blue][blue]]?

Answer 1

The regex you're looking for is \[\[(.+?)\]\s\[(.+?)\]\], with the replacement <a href="$1">$2</a>.

Whatever the pattern matches inside the () parentheses is captured and can be back-referenced using $1, $2, ...

Example on regex101.com
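As a minimal sketch of how this could be wired up in PHP, the snippet below passes the pattern to preg_replace. The variable names are illustrative, and the ".html" suffix is added here only to match the target format stated in the question; it is not part of the answer's replacement string.

```php
<?php
// Sample string from the question (note the space between "]" and "[").
$text = 'lets refer to [[merp] [that entry called merp]] and maybe also to that entry called [[blue] [blue]]';

// Group 1 captures the name, group 2 captures the link text.
$pattern = '/\[\[(.+?)\]\s\[(.+?)\]\]/';

// Assumption: ".html" is appended to build the href the question asks for.
$html = preg_replace($pattern, '<a href="$1.html">$2</a>', $text);

echo $html, PHP_EOL;
```

Because the pattern uses \s rather than \s*, it expects exactly one whitespace character between the two bracket pairs, as in the sample string.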

Answer 2

Quantifiers like * are greedy by default, which means that they match as much as possible while still letting the overall pattern succeed. E.g. in your sample, a regex like \[.*\] would match everything from the first [ to the last ] in the string. To change the default behaviour and make quantifiers lazy (ungreedy, reluctant):

Use the U (PCRE_UNGREEDY) modifier to make all quantifiers lazy.
Put a ? after a specific quantifier, e.g. .*? matches as few of any characters as possible.
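To see the difference concretely, here is a small sketch that runs the greedy \[.*\] and its lazy counterpart \[.*?\] against the sample string from the question. The variable names are only for illustration.

```php
<?php
$text = 'lets refer to [[merp] [that entry called merp]] and maybe also to that entry called [[blue] [blue]]';

// Greedy: .* takes as much as it can, so a single match runs
// from the first "[" to the last "]" in the string.
preg_match_all('/\[.*\]/', $text, $greedy);

// Lazy: .*? stops at the first "]" that lets the match succeed,
// so each bracketed piece is matched separately.
preg_match_all('/\[.*?\]/', $text, $lazy);

echo count($greedy[0]), PHP_EOL; // 1
print_r($lazy[0]);               // [[merp], [that entry called merp], [[blue], [blue]
```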

1.) Using the U modifier, a pattern could look like:

/\[\[(.*)]\s*\[(.*)]]/Us

The s (PCRE_DOTALL) modifier is added to make the . (dot) also match newlines. Some optional whitespace, \s*, is also allowed in between ][ because it appears in your sample string. \s is shorthand for [ \t\r\n\f].

There are two capturing groups (.*) whose contents are then used in the replacement. Test on regex101.com
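A short sketch of this pattern with preg_match_all, assuming the sample string from the question; PREG_SET_ORDER is used so that each match arrives together with its two groups.

```php
<?php
$text = 'lets refer to [[merp] [that entry called merp]] and maybe also to that entry called [[blue] [blue]]';

// U makes every quantifier lazy by default; s lets "." match newlines as well.
preg_match_all('/\[\[(.*)]\s*\[(.*)]]/Us', $text, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[1] is the name, $m[2] is the link text.
    printf("%s => %s\n", $m[1], $m[2]);
}
```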

2.) Instead, using ? to make each quantifier lazy:

/\[\[(.*?)]\s*\[(.*?)]]/s

Test on regex101.com
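Run against the sample string from the question, this pattern should return the two blocks the question asks for as separate full matches. A minimal sketch, with illustrative variable names:

```php
<?php
$text = 'lets refer to [[merp] [that entry called merp]] and maybe also to that entry called [[blue] [blue]]';

// Each .*? is lazy on its own; no U modifier is involved.
preg_match_all('/\[\[(.*?)]\s*\[(.*?)]]/s', $text, $matches);

// $matches[0] holds the full matches, $matches[1] and $matches[2] the groups.
print_r($matches[0]);
```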

3.) An alternative without modifiers, if no square brackets are expected inside [...]:

/\[\[([^]]*)]\s*\[([^]]*)]]/

This uses a negated character class: [^]]* allows any number of characters that are NOT ] in between [ and ]. It doesn't rely on greediness at all. Also, no . is used, so no s modifier is needed.

Test on regex101.com
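The point about not needing the s modifier can be checked with a small sketch. The input below is a hypothetical string, not taken from the question, with a line break inside the link text; a negated class such as [^]] matches newlines without any modifier.

```php
<?php
// Hypothetical input: the link text spans two lines.
$text = "see [[home]\n[the home\npage]] for details";

// No modifiers: [^]]* matches any character except "]", newlines included.
preg_match_all('/\[\[([^]]*)]\s*\[([^]]*)]]/', $text, $matches, PREG_SET_ORDER);

var_dump($matches[0][1]); // the name
var_dump($matches[0][2]); // the link text, including the line break
```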

Replacement for all 3 examples according to your sample: <a href="\1">\2</a>, where \1 corresponds to the match of the first parenthesized group, and so on.
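As a rough sketch, the three patterns can be fed to preg_replace with this replacement string and compared. The loop and variable names are illustrative; on the sample string from the question all three should produce the same output.

```php
<?php
$text = 'lets refer to [[merp] [that entry called merp]] and maybe also to that entry called [[blue] [blue]]';

$patterns = [
    '/\[\[(.*)]\s*\[(.*)]]/Us',     // 1.) U modifier
    '/\[\[(.*?)]\s*\[(.*?)]]/s',    // 2.) lazy quantifiers
    '/\[\[([^]]*)]\s*\[([^]]*)]]/', // 3.) negated character class
];

$results = [];
foreach ($patterns as $pattern) {
    // \1 and \2 refer to the first and second capturing group.
    $results[] = preg_replace($pattern, '<a href="\1">\2</a>', $text);
}

echo $results[0], PHP_EOL;
```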

2 answers

Answer 1
4 2014-04-06 07:29:10

Answer 2
2 accepted 2014-04-06 12:05:35