Regular expression to find pattern in string in PHP
Suppose I have a string that looks like:
"lets refer to [[merp] [that entry called merp]] and maybe also to that entry called [[blue] [blue]]"
The idea here is to replace a block of [[name][some text]] with <a href="name.html">some text</a>.
So I'm trying to use regular expressions to find blocks that look like [[name][some text]], but I'm having tremendous difficulty.
Here's what I thought should work (in PHP):
preg_match_all('/\[\[.*\]\[.*\]/', $my_big_string, $matches)
But this just returns a single match, the string from '[[merp' to 'blue]]'. How can I get it to return the two matches [[merp][that entry called merp]] and [[blue][blue]]?
The regex you're looking for is \[\[(.+?)\]\s\[(.+?)\]\] and you replace each match with <a href="$1">$2</a>
The text matched inside the () parentheses is captured and can be back-referenced using $1, $2, ...
Example on regex101.com
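As a minimal sketch of how that could be wired up in PHP with preg_replace: the variable names are just illustrative, and appending ".html" in the replacement is my assumption, added so the output matches the link format the question asks for.

```php
<?php
// Sample string from the question.
$text = 'lets refer to [[merp] [that entry called merp]] and maybe also to that entry called [[blue] [blue]]';

// Lazy groups; \s requires exactly one whitespace character between "]" and "[".
$pattern = '/\[\[(.+?)\]\s\[(.+?)\]\]/';

// Assumption: ".html" is appended here to get <a href="name.html">; the answer itself uses href="$1".
$html = preg_replace($pattern, '<a href="$1.html">$2</a>', $text);

echo $html, PHP_EOL;
```

Note that \s matches exactly one whitespace character, so a block written as [[name][text]] with no space in between would not be matched by this pattern.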
Quantifiers like the * are greedy by default, which means that as much as possible is matched while still meeting the conditions. E.g. in your sample a regex like \[.*\] would match everything from the first [ to the last ] in the string.
To change the default behaviour and make quantifiers lazy (ungreedy, reluctant), either use the U (PCRE_UNGREEDY) modifier, which makes all quantifiers lazy by default, or put a ? after a specific quantifier: .*? matches as few of any characters as possible.
1.) Using the U-modifier a pattern could look like:
/\[\[(.*)]\s*\[(.*)]]/Us
Additionally, the s (PCRE_DOTALL) modifier is used to make the . dot also match newlines, and \s* is added in between ][ to allow for the whitespace that is in your sample string. \s is a shorthand for [ \t\r\n\f].
There are two capturing groups (.*) to be used in the replacement then.
Test on regex101.com
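To see the difference the U modifier makes, here is a small sketch that runs the same pattern with and without it on the sample string from the question. The variable names are illustrative only.

```php
<?php
$text = 'lets refer to [[merp] [that entry called merp]] and maybe also to that entry called [[blue] [blue]]';

// Greedy (no U): the groups stretch as far as possible, so everything collapses into one match.
$greedyCount = preg_match_all('/\[\[(.*)]\s*\[(.*)]]/s', $text, $greedy);

// With U: all quantifiers are lazy by default, so each block is matched on its own.
$lazyCount = preg_match_all('/\[\[(.*)]\s*\[(.*)]]/Us', $text, $lazy);

echo $greedyCount, PHP_EOL; // 1
echo $lazyCount, PHP_EOL;   // 2
print_r($lazy[0]);          // the two full [[...] [...]] blocks
```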
2.) Instead, using the ? to make each quantifier lazy:
/\[\[(.*?)]\s*\[(.*?)]]/s
Test on regex101.com
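The s modifier only matters when a block spans more than one line. A rough sketch of that case, assuming a hypothetical input where the link text contains a newline (this input is not from the question):

```php
<?php
// Hypothetical input: the link text is broken across two lines.
$text = "[[merp] [that entry\ncalled merp]]";

// Without s, the dot does not match the newline, so the block is not found.
$withoutS = preg_match_all('/\[\[(.*?)]\s*\[(.*?)]]/', $text, $m1);

// With s (PCRE_DOTALL), the dot also matches the newline.
$withS = preg_match_all('/\[\[(.*?)]\s*\[(.*?)]]/s', $text, $m2);

echo $withoutS, PHP_EOL; // 0
echo $withS, PHP_EOL;    // 1
```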
3.) Alternative without modifiers, if no square brackets are expected to be inside [...]:
/\[\[([^]]*)]\s*\[([^]]*)]]/
Using a ^ negated character class, [^]]* allows any number of characters that are NOT ] in between [ and ]. This doesn't rely on greediness. Also, no . is used, so no s-modifier is needed.
Test on regex101.com
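If you want the name/text pairs themselves rather than a replaced string, something along these lines should work with the negated character class. PREG_SET_ORDER groups the captures per match; the variable names are illustrative.

```php
<?php
$text = 'lets refer to [[merp] [that entry called merp]] and maybe also to that entry called [[blue] [blue]]';

// [^]]* stops at the first "]", so no lazy quantifier or modifier is needed.
preg_match_all('/\[\[([^]]*)]\s*\[([^]]*)]]/', $text, $matches, PREG_SET_ORDER);

$pairs = [];
foreach ($matches as $match) {
    // $match[1] is the name, $match[2] is the link text.
    $pairs[] = [$match[1], $match[2]];
    echo $match[1], ' => ', $match[2], PHP_EOL;
}
```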
Replacement for all 3 examples according to your sample: <a href="\1">\2</a> where \1 corresponds to the match of the first parenthesized group, \2 to the second, and so on.
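In preg_replace the \1 form and the $1 form refer to the same captured group, so either can be used. A short sketch using the third pattern; the ".html" suffix is my addition to match the link format from the question, and the variable names are illustrative.

```php
<?php
$text = 'see [[merp] [that entry called merp]] and [[blue] [blue]]';
$pattern = '/\[\[([^]]*)]\s*\[([^]]*)]]/';

// Backslash form (single quotes keep the backslash as-is).
$viaBackslash = preg_replace($pattern, '<a href="\1.html">\2</a>', $text);

// Dollar form, equivalent to the above.
$viaDollar = preg_replace($pattern, '<a href="$1.html">$2</a>', $text);

echo $viaBackslash, PHP_EOL;
var_dump($viaBackslash === $viaDollar); // bool(true)
```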