Regex to linkify URLs

Question

I currently have the following regex to capture the link text and the URL from links in this format:

[Link](http://link.com)

\\[(.+)]\\(((https?:\\/\\/(?:www\\.|(?!www))[^\\s\\.]+\\.[^\\s]{2,}|www\\.[^\\s]+\\.[^\\s]{2,}))\\)

When I add another expression afterwards to linkify bare URLs, it messes up the ones in the above format.

Is there a single regular expression that handles both cases?

http://link.com -> <a href="http://link.com" target="_blank">http://link.com</a>

[Link](http://link.com) -> <a href="http://link.com" target="_blank">Link</a>

PHP:

$string = preg_replace('/\[(.+)]\(((https?:\/\/(?:www\.|(?!www))[^\s\.]+\.[^\s]{2,}|www\.[^\s]+\.[^\s]{2,}))\)/', '<a href="$2" target="_blank">$1</a>', $string);

Answer 1

There is no reliable way to identify a URL in a string, because URL syntax is very complicated (too complicated to match cleanly). In other words, you must accept that anything that looks like [...](...) stands for a link, without trying to verify that the content between ( and ) is really a URL. (You can always run parse_url on it afterwards, but keep in mind that it may reject valid URLs.)

What you are looking for is:

// The target between ( and ) needs its own capture group so that $2 is defined.
$result = preg_replace('~\[([^]]*)]\(([^)]*)\)~', '<a href="$2" target="_blank">$1</a>', $str);

// If you want to hunt lonely urls in your text, you can always search
// after extracting text nodes with XPath and a naive pattern like this:

$dom = new DOMDocument;
$dom->loadHTML($result);

$xp = new DOMXPath($dom);
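// Skip text that already sits inside a link.
$textNodes = $xp->query('//text()[not(ancestor::a)]');

foreach ($textNodes as $textNode) {
    // Escape the text first so that the fragment built below is well-formed XML.
    $escaped = htmlspecialchars($textNode->nodeValue, ENT_QUOTES);
    $html = preg_replace('~[hw](?:(?<=\bh)ttps?://|(?<=\bw)ww\.)\S+~i', '<a href="$0" target="_blank">$0</a>', $escaped, -1, $count);
    if ($count === 0) {
        continue;
    }
    // Markup assigned to nodeValue would be escaped on output, so insert a fragment instead.
    $fragment = $dom->createDocumentFragment();
    $fragment->appendXML($html);
    $textNode->parentNode->replaceChild($fragment, $textNode);
}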

$result = $dom->saveHTML();
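
As an alternative to the two steps above, both cases can be folded into one alternation handled by preg_replace_callback, which is closer to the single expression the question asks for. This is only a minimal sketch, not part of the original answer: the function name linkify, the decision to prepend http:// to www. matches, and the assumption that the input is plain text (no existing HTML) are all mine. Because the first alternative consumes the whole [text](target) construct, the bare-URL alternative never sees the URL inside the parentheses.

```php
<?php
// Hypothetical helper (not from the original answer): one pass, two alternatives.
function linkify(string $text): string
{
    // Alternative 1: [label](target). Alternative 2: a bare http(s):// or www. URL.
    $pattern = '~\[([^]]+)]\(([^)\s]+)\)|\b(?:https?://|www\.)[^\s<>"]+~i';

    return preg_replace_callback($pattern, function (array $m): string {
        if ($m[0][0] === '[') {
            // Markdown-style link: label and target come from the capture groups.
            $label = $m[1];
            $url = $m[2];
        } else {
            // Bare URL: the URL doubles as the link text.
            $label = $url = $m[0];
        }
        // Assumption: a target starting with "www." should get an http:// scheme.
        $href = stripos($url, 'www.') === 0 ? 'http://' . $url : $url;

        return sprintf(
            '<a href="%s" target="_blank">%s</a>',
            htmlspecialchars($href, ENT_QUOTES),
            htmlspecialchars($label, ENT_QUOTES)
        );
    }, $text);
}

echo linkify('See [Link](http://link.com) and http://link.com now'), "\n";
```

Like any naive pattern, the bare-URL alternative keeps trailing punctuation such as a final period, so it shares the limits described in this answer.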

Note: for better results, if you absolutely want to check the URL, you can use the bare-URL pattern from the foreach loop with preg_replace_callback: remove the last character of the match until parse_url works, then perform the replacement. It will not be very fast, though.
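
The trimming loop from the note could look roughly like the sketch below. The names isAcceptableUrl and linkifyChecked are hypothetical. Since parse_url is quite lenient, the sketch also assumes that trailing punctuation belongs to the sentence rather than to the URL; that rule is my addition, not the answer's.

```php
<?php
// Hypothetical helper: decides whether a candidate is good enough to link.
function isAcceptableUrl(string $candidate): bool
{
    // Assumption: trailing punctuation belongs to the sentence, not the URL.
    if (preg_match('~[.,;:!?)\]]$~', $candidate)) {
        return false;
    }
    // parse_url() only reports a host when a scheme is present, so add one for "www." matches.
    $withScheme = stripos($candidate, 'www.') === 0 ? 'http://' . $candidate : $candidate;
    $parts = parse_url($withScheme);

    return $parts !== false && !empty($parts['host']);
}

// Hypothetical helper: same naive pattern as in the answer, checked in a callback.
function linkifyChecked(string $text): string
{
    $pattern = '~[hw](?:(?<=\bh)ttps?://|(?<=\bw)ww\.)\S+~i';

    return preg_replace_callback($pattern, function (array $m): string {
        $url = $m[0];
        $tail = '';
        // Move characters from the end of the match into $tail until the rest is acceptable.
        while ($url !== '' && !isAcceptableUrl($url)) {
            $tail = substr($url, -1) . $tail;
            $url = substr($url, 0, -1);
        }
        if ($url === '') {
            return $m[0]; // nothing usable: leave the text untouched
        }
        $safe = htmlspecialchars($url, ENT_QUOTES);

        return '<a href="' . $safe . '" target="_blank">' . $safe . '</a>' . $tail;
    }, $text);
}

echo linkifyChecked('visit http://link.com. Bye'), "\n";
```

Each rejected character costs another parse_url call, which is why the note warns about performance.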

Answer 2

Maybe this helps a bit:

/**
 * Linkify Function
 * @param $tweet
 * @return mixed
 */
function linkify_tweet($tweet)
{
    // Convert URLs to <a> links
    $tweet = preg_replace("/([\w]+\:\/\/[\w-?&;#~=\.\/\@]+[\w\/])/", "<a href=\"mailto:w2m@bachecubano.com?subject=WEB $1\">$1</a>", $tweet);

    // Convert hashtags to Twitter searches in <a> links
    $tweet = preg_replace("/#([A-Za-z0-9\/\.]*)/", "<a href=\"#\">#$1</a>", $tweet);

    // Convert @-tags to Twitter profiles in <a> links
    $tweet = preg_replace("/@([A-Za-z0-9\/\.]*)/", "<a href=\"mailto:w2m@bachecubano.com?subject=MSG @$1\" class=\"userlink\">@$1</a>", $tweet);

    return $tweet;
}

Answer 3

Deal with the Markdown syntax first. Then catch the plain links that were not processed; you can use a similar regexp, but without the parentheses. If you want to replace everything that looks like a URL and is preceded by whitespace (so URLs inside the generated HTML won't match), this will do:

\\s(https?:\\/\\/(?:www\\.|(?!www))[^\\s.]+\\.[^\\s]{2,}|www\\.[^\\s]+\\.[^\\s]{2,})
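
A rough PHP sketch of this two-pass idea follows. The function name linkifyTwoPass is hypothetical. The first pass uses a simplified Markdown pattern rather than the one from the question. The (^|\s) group, which replaces the leading \s so that a URL at the very start of the string is also caught, is my assumption.

```php
<?php
// Hypothetical helper: Markdown links first, then bare URLs that follow whitespace.
function linkifyTwoPass(string $text): string
{
    // Pass 1: [label](target) becomes an anchor; the target is not validated.
    $text = preg_replace(
        '~\[([^]]*)]\(([^)]*)\)~',
        '<a href="$2" target="_blank">$1</a>',
        $text
    );

    // Pass 2: bare URLs at the start of the string or after whitespace.
    // URLs inside href="..." follow a double quote, so they are not matched again.
    $bare = '~(^|\s)(https?://(?:www\.|(?!www))[^\s.]+\.[^\s]{2,}|www\.[^\s]+\.[^\s]{2,})~';

    // $1 puts back the whitespace (or nothing) that preceded the URL.
    return preg_replace($bare, '$1<a href="$2" target="_blank">$2</a>', $text);
}

echo linkifyTwoPass('[Link](http://link.com) and http://link.com'), "\n";
```

Matches that start with www. are used as the href as they are, without a scheme, so a real implementation would probably want to add one.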

Answer 1
2 2016-06-16 01:02:50

Answer 2
0 2016-06-16 01:36:03

Answer 3
0 2016-06-16 03:01:28
