
How to convert a capturing group to a non-capturing group in this regex?

I am trying to replace all URLs in a text with hyperlinks using a regular expression. The URLs must start with either http:// or https://, and they must contain a TLD, e.g. .com, .org, or .co.uk.

Below is my regex pattern in PHP:

$pattern = "/(http)+(s)?:\/\/(\S)+(\.){1}/i";

So if you use the following code:

$str = "this http://dd is a String http://www.example.com and this a String https://anotherexample.co.uk";

echo preg_replace($pattern, "<a href='$0' target='_blank'>$0</a>", $str);

It gives me the following output:

[Image: screenshot of the rendered output]

You can see that the TLD part is not included in the hyperlink. So how can I convert the capturing group (\.){1} to a non-capturing group so that the match also covers the TLD?
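For reference, here is a minimal sketch that reproduces the behaviour described above without the screenshot. It lists what `$0` expands to for the original pattern and for the same pattern with every group rewritten as a non-capturing `(?:...)` group. The variable names `$capturing`, `$nonCapturing`, `$m1` and `$m2` are made up for this illustration.

```php
<?php
// Subject string from the question.
$str = "this http://dd is a String http://www.example.com and this a String https://anotherexample.co.uk";

// The question's pattern, and the same pattern with every group
// made non-capturing via (?:...). Names are hypothetical.
$capturing    = '/(http)+(s)?:\/\/(\S)+(\.){1}/i';
$nonCapturing = '/(?:http)+(?:s)?:\/\/(?:\S)+(?:\.){1}/i';

preg_match_all($capturing, $str, $m1);
preg_match_all($nonCapturing, $str, $m2);

// Index 0 holds the full matches, i.e. what $0 stands for in preg_replace.
print_r($m1[0]); // http://www.example.  and  https://anotherexample.co.
print_r($m2[0]); // identical to the list above
```

Both variants stop at the last dot, because the pattern itself ends at `\.` and nothing after it is matched. Whether a group captures affects only `$1`, `$2`, and so on, not the full match `$0`, so the pattern has to be extended past the dot for the TLD to be included.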

You can try this:

$pattern = "/https?:\/\/(\S)+(\.\w{1,4})+/i";
echo preg_replace($pattern, "<a href='$0' target='_blank'>$0</a>", $str);

In the pattern:

https? : http or https

(\.\w{1,4})+ : TLDs such as .co or .com, or multi-part ones such as .co.uk. Each dot-separated part is limited to 4 characters here, but you can change that.
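Since the question asks about non-capturing groups specifically, here is a sketch of the same idea written with `(?:...)` and with `\S+` in place of `(\S)+`. This is one possible variant rather than the answer's exact code, and the variable `$linked` is introduced only for the illustration.

```php
<?php
// Subject string from the question.
$str = "this http://dd is a String http://www.example.com and this a String https://anotherexample.co.uk";

// https?            : http or https
// \S+               : one or more non-whitespace characters (no group needed)
// (?:\.\w{1,4})+    : one or more ".xxx" parts, grouped without capturing
$pattern = '/https?:\/\/\S+(?:\.\w{1,4})+/i';

// $0 is the full match, so no capture group is required for the replacement.
$linked = preg_replace($pattern, "<a href='$0' target='_blank'>$0</a>", $str);

echo $linked, PHP_EOL;
```

With the sample string, `http://dd` is left alone because it contains no dot, while the other two URLs are wrapped in full, including `.com` and `.co.uk`.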

Use the following pattern:

/https?:\/\/[a-z]+\.[a-z]+[.a-z]*/i

  1. Keep the 's' in 'https' optional using ?
  2. Use [a-z]+ to match the first set of letters after 'https://'
  3. Ensure there is at least one '.' followed by one or more letters
  4. The rest of the slug is optional: [.a-z]* matches zero or more further dots and letters
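The steps above can be tried out with a short sketch, assuming the character classes in the pattern are the ranges `[a-z]` and that the pattern is wrapped in `/.../i` delimiters for PHP. The variables `$matches` and `$linked` are named only for this illustration.

```php
<?php
// Subject string from the question.
$str = "this http://dd is a String http://www.example.com and this a String https://anotherexample.co.uk";

// Assumed reading of the pattern: letter ranges [a-z], case-insensitive.
$pattern = '/https?:\/\/[a-z]+\.[a-z]+[.a-z]*/i';

// List the full matches first, then build the hyperlinks from $0.
preg_match_all($pattern, $str, $matches);
print_r($matches[0]);

$linked = preg_replace($pattern, "<a href='$0' target='_blank'>$0</a>", $str);
echo $linked, PHP_EOL;
```

Because the classes contain only letters and the dot, this variant will not match host names that contain digits or hyphens, and it stops before any path or query string. The classes would need to be widened if such URLs have to be linked.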



 