
Regex for matching http and www URLs in a PHP string

Here is the code I am using:

function parseURL($text) {
    $regex = "#\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))#iS";
    preg_match_all($regex, $text, $matches);
    foreach($matches[0] as $pattern){
        $text = str_replace($pattern, "<a href=\"$pattern\" target=\"_blank\">$pattern</a> ", $text);   
    }
    return $text;
}

For some reason my code is producing the following results:

www.domain.com (fully linked)

http://www.domain.com (only the www.domain.com part is linked)

http://domain.com (fully linked)

So it works fine unless the URL contains both http and www, in which case only the part from www onward is linked.

Any idea why?

EDIT

For anyone reading this who needs the fix, here is the working code, thanks to Wiktor Stribiżew:

function parseURL($text) {
    $regex = "@\b(([\w-]+://?|www[.])[^\s()<>]+(?:\(\w+\)|([^[:punct:]\s]|/)))@i";
    $subst = "<a href='$0' target='_blank'>$0</a>";
    $text = preg_replace($regex, $subst, $text);
    return $text;
}

You do not need to collect the matches first and then replace them one by one. Call preg_replace directly and use the $0 backreference in the replacement pattern to refer to the whole match.
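
As for why the loop version links only the www part: str_replace rewrites every occurrence of each match, so the shorter match www.domain.com is replaced inside http://www.domain.com before the longer match gets its turn, and by then the longer string is no longer in the text. A minimal sketch comparing the two approaches, assuming the same three-line input as the question (the variable names $looped and $single are only for illustration, and target="_blank" is left out for brevity):

```php
<?php
// The regex from the question, with the delimiter and flags used in the fix.
$regex = '@\b(([\w-]+://?|www[.])[^\s()<>]+(?:\(\w+\)|([^[:punct:]\s]|/)))@i';
$text  = "www.domain.com\nhttp://www.domain.com\nhttp://domain.com";

// Original approach: collect all matches, then str_replace each one in turn.
preg_match_all($regex, $text, $matches);
$looped = $text;
foreach ($matches[0] as $url) {
    // str_replace rewrites EVERY occurrence of $url, including the
    // "www.domain.com" that sits inside "http://www.domain.com".
    $looped = str_replace($url, "<a href=\"$url\">$url</a>", $looped);
}

// Single-pass approach: each match is replaced exactly once, where it was found.
$single = preg_replace($regex, '<a href="$0">$0</a>', $text);

echo $looped, "\n\n", $single, "\n";
```

In the looped result the second line comes out as http:// followed by a link around www.domain.com only, which is the behaviour described in the question.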

See the PHP demo of the preg_replace approach:

$re = '@\b(([\w-]+://?|www[.])[^\s()<>]+(?:\(\w+\)|([^[:punct:]\s]|/)))@i';
$str = "www.domain.com\nhttp://www.domain.com\nhttp://domain.com";
$subst = '<a href="$0" target="_blank">$0</a> ';
$result = preg_replace($re, $subst, $str);
echo $result;

Output:

<a href="www.domain.com" target="_blank">www.domain.com</a> 
<a href="http://www.domain.com" target="_blank">http://www.domain.com</a> 
<a href="http://domain.com" target="_blank">http://domain.com</a> 
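
One thing to watch in this output: href="www.domain.com" has no scheme, so a browser will treat it as a path relative to the current page. A possible way around that, sketched here with preg_replace_callback, is to prepend http:// when the match starts with www. The function name linkifyWithScheme is hypothetical, the choice of http:// as the default scheme is an assumption, and the sketch assumes $text is plain text that has not been HTML-escaped yet:

```php
<?php
// Hypothetical helper (not from the original post): same regex as above,
// but www-only matches get an assumed "http://" scheme in the href.
function linkifyWithScheme(string $text): string {
    $regex = '@\b(([\w-]+://?|www[.])[^\s()<>]+(?:\(\w+\)|([^[:punct:]\s]|/)))@i';

    $result = preg_replace_callback($regex, function (array $m): string {
        $url = $m[0];
        // Group 2 holds the prefix that matched: "scheme://" or "www."
        $href = (stripos($m[2], 'www.') === 0) ? 'http://' . $url : $url;
        // Escape quotes and ampersands so the URL cannot break out of the attribute.
        $safeHref = htmlspecialchars($href, ENT_QUOTES);
        $safeText = htmlspecialchars($url, ENT_QUOTES);
        return "<a href=\"$safeHref\" target=\"_blank\">$safeText</a>";
    }, $text);

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

echo linkifyWithScheme("www.domain.com\nhttp://www.domain.com"), "\n";
```

The visible link text stays exactly as the user typed it; only the href changes for the www case.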


 