Regex to match links that are not wrapped in <a></a> tags

Question

(http([s]?):\/\/?)(([a-zA-Z0-9]+(\.?))+)([a-zA-Z0-9]+((\.[a-zA-Z]{2,5}){1,2})((\/[a-zA-Z0-9\?&=_\-\~:/?#[\]@!\$&'()\*\+,;]*)*)((\.[a-zA-Z]{2,5}){0,2}))

This is my regex, which works well for matching the links in a string. But I don't want it to select every link. If a link has "> before it, or </a> after it, that link shouldn't be matched. How can this be done?

These should be matched:

adasdas http://www.stackoverflow.com asdasas
adasdasahttp://www.stackoverflow.com/something asdas

These should NOT be matched:

adasdas<a href="somelink">           http://www.stackoverflow.com     </a>asdasas
adasdasa<a href="somelink">http://www.stackoverflow.com/something</a> asdas

Why do I need this? I want every link to be clickable even if it isn't between anchor tags.

Answer 1

With all the disclaimers about using regex to parse HTML, if you want to use regex for this task, this will work:

$regex="~<a.*?</a>(*SKIP)(*F)|http://\S+~";

See the demo.

This problem is a classic case of the technique explained in this question to "regex-match a pattern, excluding..."

The left side of the alternation | matches complete <a ...> ... </a> elements and then deliberately fails, after which the engine skips ahead to the position just past the closing </a>. The right side matches the URLs, and we know they are the right ones because they were not matched by the expression on the left.

The URL regex I put on the right can be refined; just use whatever suits your needs.
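To see the skip-and-fail alternation at work, the pattern can be run over the four sample strings from the question. This is a minimal sketch, assuming PHP with PCRE; the helper name find_bare_links is hypothetical and exists only for this illustration, while the pattern itself is the one given above.

```php
<?php
// Hypothetical helper for illustration: returns every URL in $text that is
// not part of an <a ...>...</a> element, using the pattern from the answer.
function find_bare_links(string $text): array
{
    // Left branch: consume a whole anchor element, then force a failure;
    // (*SKIP) makes the engine resume after the anchor instead of retrying
    // inside it. Right branch: the simple URL pattern.
    $regex = '~<a.*?</a>(*SKIP)(*F)|http://\S+~';
    preg_match_all($regex, $text, $matches);
    return $matches[0]; // full matches only
}

// The four sample strings from the question.
$samples = [
    'adasdas http://www.stackoverflow.com asdasas',
    'adasdasahttp://www.stackoverflow.com/something asdas',
    'adasdas<a href="somelink">           http://www.stackoverflow.com     </a>asdasas',
    'adasdasa<a href="somelink">http://www.stackoverflow.com/something</a> asdas',
];

foreach ($samples as $sample) {
    echo json_encode(find_bare_links($sample), JSON_UNESCAPED_SLASHES), PHP_EOL;
}
```

The first two samples each yield their URL, while the two samples whose URL sits inside an anchor element yield an empty list.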

Reference

Answer 2

You need to add lookarounds to your regex, cf.:
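The pattern this answer referred to is not reproduced here. Purely as an illustration of the lookaround idea, and not as the answer's own regex, one possible sketch in PHP uses a negative lookahead that rejects a URL when a closing </a> follows before any other tag opens. The helper name find_unanchored_links and the pattern are assumptions made for this example.

```php
<?php
// Hypothetical helper: lookahead-based variant, an assumption for this sketch.
function find_unanchored_links(string $text): array
{
    // [^\s<]+ stops the URL at whitespace or at the next tag.
    // (?![^<]*</a>) fails when only non-tag text separates the URL from </a>,
    // which is the case for a URL sitting inside an anchor element.
    $regex = '~https?://[^\s<]+(?![^<]*</a>)~';
    preg_match_all($regex, $text, $matches);
    return $matches[0];
}

// One bare URL and one URL inside an anchor, both taken from the question.
echo json_encode(
    find_unanchored_links('adasdas http://www.stackoverflow.com asdasas'),
    JSON_UNESCAPED_SLASHES
), PHP_EOL;
echo json_encode(
    find_unanchored_links('adasdas<a href="somelink">           http://www.stackoverflow.com     </a>asdasas'),
    JSON_UNESCAPED_SLASHES
), PHP_EOL;
```

This check is shallow: if other tags sit between the URL and the closing </a>, the lookahead no longer sees the anchor and the URL is matched anyway.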

Answer 3

Here's some PHP code I combined (from answers on here) into a function that does this for both emails and URLs:

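function replace_links( $content ){
    // Skip existing <a ...>...</a> elements, then wrap each bare http(s) URL in an anchor.
    $content = preg_replace('"<a[^>]+>.+?</a>(*SKIP)(*FAIL)|\b(?:https?)://\S+"', '<a href="$0">$0</a>', $content);
    // Same skip trick for email addresses, which become mailto: links.
    $content = preg_replace('"<a[^>]+>.+?</a>(*SKIP)(*FAIL)|\b(\S+@\S+\.\S+)\S+"', '<a href="mailto:$0">$0</a>', $content);
    return $content;
}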
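As a quick check of how the function behaves, it can be applied to a string that mixes a bare URL, an already-linked URL, and an email address. This is a small sketch; the input string and the address me@example.com are made up for the example, and the function body is repeated only so the snippet runs on its own.

```php
<?php
// Function from the answer, repeated so this snippet is self-contained.
function replace_links( $content ){
    $content = preg_replace('"<a[^>]+>.+?</a>(*SKIP)(*FAIL)|\b(?:https?)://\S+"', '<a href="$0">$0</a>', $content);
    $content = preg_replace('"<a[^>]+>.+?</a>(*SKIP)(*FAIL)|\b(\S+@\S+\.\S+)\S+"', '<a href="mailto:$0">$0</a>', $content);
    return $content;
}

// Made-up input: one bare URL, one URL already inside an anchor, one email.
$input = 'Visit http://www.stackoverflow.com or '
       . '<a href="somelink">http://www.stackoverflow.com/something</a>, '
       . 'mail me@example.com';

// The bare URL and the email get wrapped; the existing anchor is left alone.
echo replace_links($input), PHP_EOL;
```

Because the second call runs on the output of the first, the anchors created for URLs are themselves skipped when the email pattern is applied.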

Demo: https://glot.io/snippets/g6nwd6amyo

Most up-to-date version: https://gist.github.com/tripflex/0cc930c2afe5f4c73f2aed61cedf95d0

3 answers

Answer 1: score 14, accepted, 2014-07-09 12:28:29
Answer 2: score 1, 2014-07-09 11:18:16
Answer 3: score 0, 2022-02-01 23:42:14
