
How to convert a capturing group to a non-capturing group in this regex?

I am trying to replace all URLs in a text with hyperlinks using a regular expression. The URLs must start with either http:// or https://, and they must contain a TLD, e.g. .com, .org, or .co.uk.

Below is my regex pattern in PHP:

$pattern = "/(http)+(s)?:\/\/(\S)+(\.){1}/i";

So if you use the following code:

$str = "this http://dd is a String http://www.example.com and this a String https://anotherexample.co.uk";

echo preg_replace($pattern, "<a href='$0' target='_blank'>$0</a>", $str);

It gives me the following output:

[image: output screenshot]

You can see that the TLD part is not included in the hyperlink. So how can I convert the capturing group (\.){1} to a non-capturing group so that the match also covers the TLD?
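For reference, the behaviour described above can be reproduced without a browser. The sketch below uses the pattern and input string from the question; the variable names $nonCapturing, $matches and $matches2 are introduced here for illustration only. It also suggests that swapping (\.){1} for the non-capturing form (?:\.) would not change what $0 contains, since group type affects numbering and not the matched text.

```php
<?php
// Pattern and input copied from the question.
$pattern = "/(http)+(s)?:\/\/(\S)+(\.){1}/i";
$str = "this http://dd is a String http://www.example.com and this a String https://anotherexample.co.uk";

// $matches[0] holds the full matches, i.e. what $0 expands to in preg_replace.
preg_match_all($pattern, $str, $matches);
print_r($matches[0]);
// [0] => http://www.example.
// [1] => https://anotherexample.co.

// Same pattern with the last group made non-capturing (illustrative variant).
$nonCapturing = "/(http)+(s)?:\/\/(\S)+(?:\.)/i";
preg_match_all($nonCapturing, $str, $matches2);

// The full matches are identical; only the group numbering differs.
var_dump($matches[0] === $matches2[0]); // bool(true)
```

The match stops at the last dot because nothing in the pattern consumes the characters after it, so the fix has to extend the pattern rather than change the group type.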

You can try this:

$pattern = "/https?:\/\/(\S)+(\.\w{1,4})+/i";
echo preg_replace($pattern, "<a href='$0' target='_blank'>$0</a>", $str);

In the pattern:

https? : matches http or https

(\.\w{1,4})+ : matches one or more TLD parts, such as .co, .com, or .co.uk. Each part is limited to 4 characters here, but you can change that.
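Since the question asks specifically about non-capturing groups, here is a minimal sketch of the same idea written with (?:...). It is a variant for illustration, not the answer's exact pattern: the ~ delimiter and the single-quoted strings are stylistic choices made here to avoid escaping the slashes, and $result is a name introduced for the example.

```php
<?php
$str = "this http://dd is a String http://www.example.com and this a String https://anotherexample.co.uk";

// (?:\.\w{1,4})+ groups the TLD parts without creating a numbered capture.
// Using ~ as the delimiter means the slashes in :// need no escaping.
$pattern = '~https?://\S+(?:\.\w{1,4})+~i';

// $0 is the whole match, so no capturing groups are needed for the replacement.
$result = preg_replace($pattern, '<a href="$0" target="_blank">$0</a>', $str);
echo $result;
// this http://dd is a String <a href="http://www.example.com" target="_blank">http://www.example.com</a>
// and this a String <a href="https://anotherexample.co.uk" target="_blank">https://anotherexample.co.uk</a>
// (printed as a single line)
```

Note that http://dd is left untouched because it contains no dot. One limitation to be aware of: a URL with a path such as http://example.com/some/page would only be linked up to http://example.com, because the match must end in a dot followed by 1 to 4 word characters.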

Use the following pattern:

/https?:\/\/[a-z]+\.[a-z]+[.a-z]*/i

  1. Keep the 's' in 'https' optional using ?
  2. Use [a-z]+ to match the first set of letters after 'https://'
  3. Ensure there is at least one '.' followed by one or more letters
  4. The rest of the host name is optional and can appear zero or more times: [.a-z]*
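The steps above can be tried out with the short sketch below. It assumes the character classes in the pattern are meant to be [a-z] and [.a-z], and it reuses the input string from the question; $m and $linked are names introduced for the example.

```php
<?php
$str = "this http://dd is a String http://www.example.com and this a String https://anotherexample.co.uk";

// No groups at all: character classes describe the host name directly.
$pattern = '/https?:\/\/[a-z]+\.[a-z]+[.a-z]*/i';

// List the full matches first.
preg_match_all($pattern, $str, $m);
print_r($m[0]);
// [0] => http://www.example.com
// [1] => https://anotherexample.co.uk

// Then wrap each match in a link, as in the question.
$linked = preg_replace($pattern, '<a href="$0" target="_blank">$0</a>', $str);
echo $linked;
```

Because the classes only allow letters and dots, host names containing digits or hyphens (for example http://my-site.com) would not match; widening the classes to something like [a-z0-9-] is one possible adjustment.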

