preg_match_all and foreach only replacing last match

I have the following code, which should make plain-text links clickable. However, if there are several links, it only replaces the last one.

Code:

$nc = preg_match_all('#<pre[\s\S]*</pre>#U', $postbits, $matches_code); 
foreach($matches_code[0] AS $match_code) 
{
    $match = null;
    $matches = null;
    $url_regex = '#https?://(\w*:\w*@)?[-\w.]+(:\d+)?(/([\w/_.]*(\?\S+)?)?)?[^<\.,:;"\'\s]+#'; 
    $n = preg_match_all($url_regex, $match_code, $matches);
    foreach($matches[0] AS $match)
    {
        $html_url = '<a href="' . $match . '" target="_blank">' . $match . '</a>';
        $match_string = str_replace($match, $html_url, $match_code);
    }
    $postbits = str_replace($match_code, $match_string, $postbits); 
}

Result:

http://www.google.com

http://www.yahoo.com

http://www.microsoft.com/ <-- only this one is clickable

Expected result:

http://www.google.com

http://www.microsoft.com/

Where is my error?

if there are several links it only replaces the last one

Where is my error?

Actually, it's replacing all 3 links, but each replacement starts over from the original string.

foreach($matches[0] AS $match)
{
    $html_url = '<a href="' . $match . '" target="_blank">' . $match . '</a>';
    $match_string = str_replace($match, $html_url, $match_code);
}

The loop runs 3 times. Each iteration replaces 1 link in $match_code and assigns the result to $match_string. On the first iteration, $match_string gets the text with a clickable google.com. On the second iteration, it gets the text with a clickable yahoo.com. However, that replacement was made on the original $match_code again, so the google.com link from the first iteration is overwritten and is no longer clickable. That's why only the last link ends up clickable.
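A minimal sketch of one way to fix the loop, keeping the question's variable names: initialize $match_string from $match_code once, then make every replacement on $match_string so the earlier links survive. The sample $postbits value is an assumption for illustration, and array_unique() is an addition so that a URL appearing twice is not wrapped twice.

```php
<?php
// Hypothetical sample input; variable names follow the question's code.
$postbits = "<pre>\nhttp://www.google.com\n\nhttp://www.yahoo.com\n\nhttp://www.microsoft.com/\n</pre>";

$url_regex = '#https?://(\w*:\w*@)?[-\w.]+(:\d+)?(/([\w/_.]*(\?\S+)?)?)?[^<\.,:;"\'\s]+#';

preg_match_all('#<pre[\s\S]*</pre>#U', $postbits, $matches_code);
foreach ($matches_code[0] as $match_code) {
    // Start from the original block once...
    $match_string = $match_code;
    preg_match_all($url_regex, $match_code, $matches);
    // ...then keep replacing in the accumulated string, not in $match_code.
    foreach (array_unique($matches[0]) as $match) {
        $html_url = '<a href="' . $match . '" target="_blank">' . $match . '</a>';
        $match_string = str_replace($match, $html_url, $match_string);
    }
    $postbits = str_replace($match_code, $match_string, $postbits);
}

echo $postbits;
```

This still relies on str_replace(), so it can misbehave when one URL is a substring of another (for example http://www.google.com and http://www.google.com/maps); a single regex-based replacement avoids that.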


There are a couple of things you may also want to correct in your code:

  1. The regex #<pre[\s\S]*</pre>#U is better written as #<pre.*</pre>#Us . The class [\s\S]* is a workaround normally seen in JavaScript, which (before ES2018) had no s flag to let the dot match newlines.
  2. I don't get why you're using that pattern to match URLs. I think you could simply use https?://\S+ . I'll also link you to some alternatives here.
  3. You're using 2 preg_match_all() calls and 1 str_replace() call for the same text, where you could wrap it all up in 1 preg_replace() .
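To illustrate point 1, here is a small sketch comparing the two patterns on a multi-line block. The $text value and the result variable names are assumptions for illustration only.

```php
<?php
// Hypothetical multi-line <pre> block.
$text = "<pre>line one\nline two</pre>";

// [\s\S] matches any character, including newlines.
$with_class = preg_match('#<pre[\s\S]*</pre>#U', $text, $m1);

// With the s flag (PCRE_DOTALL), the dot also matches newlines.
$with_s_flag = preg_match('#<pre.*</pre>#Us', $text, $m2);

// Without the s flag, the dot stops at the newline, so nothing matches.
$without_s_flag = preg_match('#<pre.*</pre>#U', $text, $m3);

var_dump($with_class, $with_s_flag, $without_s_flag);
```

Both of the first two patterns capture the same text; only the third one fails, because the block spans more than one line.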

Code

$postbits = "
<pre>
http://www.google.com

http://w...content-available-to-author-only...o.com

http://www.microsoft.com/ <-- only this one clickable
</pre>";


$regex = '#\G((?:(?!\A)|.*<pre)(?:(?!</pre>).)*)(https?://\S+?)#isU';
$repl = '\1<a href="\2" target="_blank">\2</a>';

$postbits = preg_replace( $regex, $repl, $postbits);

ideone demo

Regex

  • \G Anchors each match at the position where the previous match ended (or at the start of the subject for the first match).
  • Group 1
    • (?:(?!\A)|.*<pre) Either continues from the end of the previous match (any position other than the beginning of the string), or skips ahead to a <pre tag. The second branch finds the first <pre tag from the beginning of the string, and moves on to the next <pre tag when no more URLs are found in the current one.
    • (?:(?!</pre>).)*) Consumes any characters inside a <pre> tag, without crossing the closing </pre>.
  • Group 2
    • (https?://\S+?) Matches 1 URL. Because the U flag inverts greediness, \S+? is greedy here.
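For comparison, a possibly easier-to-read alternative (not the approach used in the answer above) is to split the work in two: preg_replace_callback() finds each <pre> block, and an inner preg_replace() linkifies the URLs in that block. This is only a sketch: the sample $postbits and the outside.example host are made up, and the URL pattern [^\s<]+ is my own assumption, chosen so that a URL directly followed by </pre> does not swallow the tag.

```php
<?php
// Hypothetical input: one URL outside any <pre>, and two <pre> blocks.
$postbits = "http://outside.example\n<pre>\nhttp://www.google.com\nhttp://www.microsoft.com/\n</pre>\n<pre>http://www.yahoo.com</pre>";

$postbits = preg_replace_callback(
    '#<pre.*</pre>#Us',            // each <pre>...</pre> block, shortest match
    function (array $m): string {
        // Linkify every URL inside this one block; $0 is the whole URL match.
        return preg_replace(
            '#https?://[^\s<]+#i',
            '<a href="$0" target="_blank">$0</a>',
            $m[0]
        );
    },
    $postbits
);

echo $postbits;
```

Text outside the <pre> blocks is passed through untouched, so the first URL in the sample stays as plain text.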
