Regular expression to find specific URLs within a string in PHP

I am looking to find specific URLs within a large string of text. The URLs are in this format:

https://name.myurl.com/#/shop/rmpa8cmnfg3eerpus3ap9jwekz6k77pnj2pg50ua/login

*The segment after /shop/ (rmpa8cmnfg3eerpus3ap9jwekz6k77pnj2pg50ua) is random.

Currently I am able to extract ALL URLs using the following:

preg_match_all('!https?://\S+!', $string, $matches); 

I then need to loop over the matches and pull out all URLs that include a specific string using:

$arr = $matches[0];

foreach ($arr as $haystack) {
    // Keep only the URLs that contain "shop".
    if (strpos($haystack, "shop") !== false) {
        echo $haystack;
    }
}

I am trying to make the code more efficient and can't seem to nail down a regular expression that finds all URLs matching:

https://name.myurl.com/#/shop/rmpa8cmnfg3eerpus3ap9jwekz6k77pnj2pg50ua/login

If I could, it would eliminate the need for the second string lookup.

Any help would be much appreciated.

Thanks

The point is that you need to ask yourself what is particular about the strings you need to match: whether the URL contains a subpath of interest, whether that subpart is the second segment or the second from the end, whether it consists of both letters and digits, and so on.

Once you know what to match, you can start on a regex.
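To illustrate how such observations turn into a pattern, here is a minimal sketch. It assumes the random segment consists only of lowercase letters and digits and is always followed by /login; both assumptions are drawn from the single sample URL in the question and may not hold for all of your data. The sample text and variable names are made up for the example.

```php
<?php
// Assumption: the random segment is lowercase letters and digits, followed by /login.
$pattern = '~https?://\S+/shop/[a-z0-9]+/login~';

// Made-up sample text: only the first URL has both /shop/ and /login.
$text = 'https://name.myurl.com/#/shop/rmpa8cmnfg3eerpus3ap9jwekz6k77pnj2pg50ua/login '
      . 'https://name.myurl.com/#/shop/abc123/cart '
      . 'https://name.myurl.com/#/blog/abc123/login';

preg_match_all($pattern, $text, $found);

// $found[0] holds the full matches.
print_r($found[0]);
```

The more you know about the target, the narrower the pattern can be; if the token format is not guaranteed, a looser pattern is the safer choice.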

It seems that you need to match URLs with a /shop/ subpath. In that case, all you need to do is include that subpattern in the pattern. Since it is a literal sequence of characters, there is nothing difficult about it:

'~https?://\S+/shop/\S+~'
              ^^^^^^

See the regex demo.
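As a rough sketch of how this pattern could replace the two-step approach from the question, the following runs it through preg_match_all in a single pass. The sample text and the second and third URLs are invented for illustration.

```php
<?php
// Made-up sample text containing two /shop/ URLs and one unrelated URL.
$string = 'Visit https://name.myurl.com/#/shop/rmpa8cmnfg3eerpus3ap9jwekz6k77pnj2pg50ua/login '
        . 'or https://name.myurl.com/#/about for details, then '
        . 'http://other.myurl.com/#/shop/abc123/login today.';

// One pass: only URLs that contain a /shop/ segment are captured.
preg_match_all('~https?://\S+/shop/\S+~', $string, $matches);

foreach ($matches[0] as $url) {
    echo $url, PHP_EOL;
}
```

Because \S+ matches any run of non-whitespace characters, punctuation that directly follows a URL (a trailing comma or period, for example) would become part of the match, so the input may need trimming in that case.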

If all you want to do is verify that /shop/ is part of the URL, use:

https?:\/\/\S*\/shop\/\S*

It's basically your regex, with the addition of requiring /shop/ somewhere after the protocol part (http(s)://) and allowing any non-space characters before and after it.

Regards
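A minimal sketch of how this check might look in PHP follows. The function name containsShopPath is hypothetical, and wrapping the pattern in / delimiters is an assumption about how it would be passed to preg_match.

```php
<?php
// Hypothetical helper: returns true when the URL contains a /shop/ segment.
function containsShopPath(string $url): bool
{
    // Same pattern as above, wrapped in / delimiters (hence the escaped slashes).
    return preg_match('/https?:\/\/\S*\/shop\/\S*/', $url) === 1;
}

var_dump(containsShopPath('https://name.myurl.com/#/shop/abc123/login'));
var_dump(containsShopPath('https://name.myurl.com/#/shop/'));
var_dump(containsShopPath('https://name.myurl.com/#/about'));
```

The first two calls should report true and the third false. The second case shows the effect of \S*: it also accepts a URL with nothing after /shop/, which a pattern ending in \S+ would reject.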
