
How do you find and hyperlink all URLs from text using a regular expression?

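This is my function so far. It works well, but it also starts hyperlinking ordinary text when punctuation (a comma or period) is followed by a line break: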

function linkify($text) {
    $url = '@(http(s)?)?(://)?(([a-zA-Z])([-\w]+\.)+([^\s\.]+[^\s]*)+[^,.\s])@';
    $string = preg_replace($url, '<a href="http$2://$4" target="_blank" title="$0">$0</a>', $text);
    return $string;
}

For example:

echo linkify("I went to the local food store and bought some food.

I was able to find everything.");

would return this:

I went to the local food store and bought some <a href="http://food.<br" target="_blank" title="food.<br">food.<br</a> />
<br />
I was able to find everything.

Can someone help me figure out what I am doing wrong?

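This will slightly improve the accuracy of your original pattern. My pattern runs nearly twice as fast as yours. I removed unneeded/unused capture groups, improved the accuracy of the optional `//` part, added the case-insensitive flag at the end of the pattern, removed unnecessary escaping, and generally condensed your pattern for brevity.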
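To see where the original pattern goes wrong, it can help to run both patterns against the problem sentence directly. The snippet below is only a sketch. It assumes the text already contains `<br />` tags when it reaches `linkify()` (for instance after `nl2br()`), which is inferred from the output shown in the question and not stated there. The variable names are illustrative.

```php
<?php
// Original pattern from the question.
$original = '@(http(s)?)?(://)?(([a-zA-Z])([-\w]+\.)+([^\s\.]+[^\s]*)+[^,.\s])@';
// Revised pattern given in this answer.
$revised = '@(?:http(s)?://)?([a-z][-\w]+(?:\.\w+)+(?:\S+)?)@i';

// Assumed input: the sentence with <br /> tags already inserted.
$text = "bought some food.<br />\n<br />\nI was able to find everything.";

// The original pattern accepts any non-space characters after "word.",
// so "food." followed by "<br" looks like a host name to it.
preg_match($original, $text, $m);
echo $m[0], "\n"; // food.<br

// The revised pattern requires word characters after each dot,
// so nothing in this text qualifies as a URL.
var_dump(preg_match($revised, $text)); // int(0)
```

The difference comes from what is allowed after the dot: `[^\s\.]+` in the original versus `\w+` in the revised pattern.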

Pattern Demo with Replacement

Code: (Demo)

function linkify($text){
    $capture='@(?:http(s)?://)?([a-z][-\w]+(?:\.\w+)+(?:\S+)?)@i';
    $replace='<a href="http$1://$2" target="_blank" title="$0">$0</a>';
    $string = preg_replace($capture,$replace,$text);
    return $string;
}

echo linkify("Here is a sentence with a url containing a query string: https://www.google.com/search?q=mickmackusa&oq=mickmackusa&aqs=chrome..69i57j69i60.271j0j7&sourceid=chrome&ie=UTF-8 all good."),"\n\n---\n\n";
echo linkify("http://google.com"),"\n\n---\n\n";
echo linkify("http://google.com.au"),"\n\n---\n\n";
echo linkify("https://google.com.au"),"\n\n---\n\n";
echo linkify("www.google.com"),"\n\n---\n\n";
echo linkify("google.com"),"\n\n---\n\n";
echo linkify("I went to the local food store and bought some food.\n\nI was able to find everything"),"\n\n---\n\n";
echo linkify("I went to the local food store and bought some food.

I was able to find everything");

Output:

Here is a sentence with a url containing a query string: <a href="https://www.google.com/search?q=mickmackusa&oq=mickmackusa&aqs=chrome..69i57j69i60.271j0j7&sourceid=chrome&ie=UTF-8" target="_blank" title="https://www.google.com/search?q=mickmackusa&oq=mickmackusa&aqs=chrome..69i57j69i60.271j0j7&sourceid=chrome&ie=UTF-8">https://www.google.com/search?q=mickmackusa&oq=mickmackusa&aqs=chrome..69i57j69i60.271j0j7&sourceid=chrome&ie=UTF-8</a> all good.

---

<a href="http://google.com" target="_blank" title="http://google.com">http://google.com</a>

---

<a href="http://google.com.au" target="_blank" title="http://google.com.au">http://google.com.au</a>

---

<a href="https://google.com.au" target="_blank" title="https://google.com.au">https://google.com.au</a>

---

<a href="http://www.google.com" target="_blank" title="www.google.com">www.google.com</a>

---

<a href="http://google.com" target="_blank" title="google.com">google.com</a>

---

I went to the local food store and bought some food.

I was able to find everything

---

I went to the local food store and bought some food.

I was able to find everything

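This may not be a silver bullet for all possible URLs, but it is a reasonable foundation. If you find strings that are not replaced as expected, the pattern may need some tweaking.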
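One tweak that may be needed: the trailing `(?:\S+)?` also captures sentence punctuation directly after a URL, so `google.com.` is linked with the final period included. A minimal sketch of one way to handle that is to build the replacement in a callback and move trailing punctuation outside the link. The function name `linkifyTrimmed` and the set of punctuation characters are assumptions for illustration, not part of the original answer.

```php
<?php
// Hypothetical variant of linkify(): same pattern as the answer, but the
// replacement is built in a callback so that punctuation swallowed by \S+
// at the end of a match can be placed after the closing </a>.
function linkifyTrimmed(string $text): string
{
    $pattern = '@(?:http(s)?://)?([a-z][-\w]+(?:\.\w+)+(?:\S+)?)@i';

    return preg_replace_callback($pattern, function (array $m): string {
        // $m[1] is "s" (or empty); $m[2] is everything after the scheme.
        $url = rtrim($m[2], '.,;:!?');           // assumed punctuation set
        $cut = strlen($m[2]) - strlen($url);     // number of characters trimmed

        $trailing = $cut > 0 ? substr($m[2], -$cut) : '';
        $label    = $cut > 0 ? substr($m[0], 0, -$cut) : $m[0];

        return '<a href="http' . $m[1] . '://' . $url . '" target="_blank" title="'
             . $label . '">' . $label . '</a>' . $trailing;
    }, $text);
}

echo linkifyTrimmed('Visit google.com.'), "\n";
// Visit <a href="http://google.com" target="_blank" title="google.com">google.com</a>.
```

URLs that legitimately end in one of these characters would be shortened by this approach, so the character list is a trade-off to adjust for your own data.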


Pattern update/extension to include URLs with subdomains:

~(?:ht{2}p(s)?:/{2})?([a-z][-\w.]+(?:\.\w+)+(?:\S+)?)~i
//  new dot here---------------^

I simply added a dot to the character class.
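As a quick sanity check, the extended pattern can be run against a few strings to confirm that it matches URLs with subdomains and still leaves the problem sentence alone. This is a sketch only. The helper `firstMatch` and the sample inputs are made up for illustration.

```php
<?php
// Extended pattern from the answer, with the dot added to the character class.
$pattern = '~(?:ht{2}p(s)?:/{2})?([a-z][-\w.]+(?:\.\w+)+(?:\S+)?)~i';

// Hypothetical helper: return the first full match, or null when nothing matches.
function firstMatch(string $pattern, string $text): ?string
{
    return preg_match($pattern, $text, $m) === 1 ? $m[0] : null;
}

// Illustrative inputs: two URLs with subdomains and the sentence from the question.
$samples = [
    'https://mail.google.com/mail',
    'www.google.com',
    "bought some food.\n\nI was able to find everything",
];

foreach ($samples as $sample) {
    // var_export() keeps the embedded newlines and null results visible.
    var_export([$sample => firstMatch($pattern, $sample)]);
    echo "\n";
}
```

The first two samples should be matched in full, and the sentence should produce no match, because the period after "food" is followed by a line break and not by word characters.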
