PHP RegEx with or without trailing slashes

My goal:

To capture the last part of a URL, whether or not there is a trailing slash, without the trailing slash being part of the captured string, on a URL similar to the following:

http://foo.com/p/dPWjiVtX-C/
                 ^^^^^^^^^^
               The string I want

My issue:

Every approach I try either only works with a trailing slash (and not for a URL without one) or leaves the trailing slash inside the string I want.

What have I tried?

1. I have tried to add a slash to the end:

  $regex = "/.*?foo\.com\/p\/(.*)\//";
  if ($c=preg_match_all ($regex, $url, $matches))
  {
    $id=$matches[1][0];
    print "ID: $id \n";
  }

This fails when the URL doesn't have a trailing slash.

2. I have tried to add a question mark:

  $regex = "/.*?foo\.com\/p\/(.*)[\/]?/";

This results in the slash, if it exists, being inside my string.

My question/tl;dr:

How can I build a RegEx that does not require a slash, yet keeps the slash out of my preceding string?

Your .* is greedy by default, so if it can "eat" the slash in the capturing group, it will.
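To see the greedy behaviour concretely, here is a minimal sketch that runs the second attempt from the question against the example URL. The URL is the one from the question; nothing else is assumed.

```php
<?php
// Second attempt from the question: greedy group followed by an optional slash.
$regex = "/.*?foo\.com\/p\/(.*)[\/]?/";

// The greedy (.*) consumes everything to the end of the string,
// so the optional [\/]? is left to match nothing.
preg_match($regex, "http://foo.com/p/dPWjiVtX-C/", $m);

echo $m[1], "\n"; // prints dPWjiVtX-C/ (slash included)
```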

To make it non-greedy, use .*? in place of the .* in your capturing group, and anchor the pattern with ^ and $ so that the lazy group is still forced to extend to the end of the URL. So, your regex will be:

$regex = "/^.*?instagram\.com\/p\/(.*?)[\/]?$/";
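A minimal sketch of how that pattern behaves with and without the trailing slash. The helper name `extractId` and the sample URLs are hypothetical, and the host is assumed to be instagram.com as in the pattern above.

```php
<?php
// Hypothetical helper: returns the ID segment, or null when the URL does not match.
function extractId(string $url): ?string
{
    // Lazy group plus optional slash, anchored at both ends of the string.
    $regex = "/^.*?instagram\.com\/p\/(.*?)[\/]?$/";
    return preg_match($regex, $url, $m) === 1 ? $m[1] : null;
}

// Illustrative URLs only.
echo extractId("http://instagram.com/p/dPWjiVtX-C/"), "\n"; // prints dPWjiVtX-C
echo extractId("http://instagram.com/p/dPWjiVtX-C"), "\n";  // prints dPWjiVtX-C
```

The `$` anchor is what makes this work: without it, the lazy `(.*?)` would be satisfied by the empty string and the group would capture nothing.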

You can use this to capture all characters except the trailing slash in your group:

$regex = "/.*?instagram\.com\/p\/([^\/]*)/";

Alternatively, you can use a non-greedy quantifier in your group, but then you have to specify a trailing slash or the end of the string (or some other terminator) so that the group captures your id:

$regex = "/.*?instagram\.com\/p\/(.*?)(?:\/|$)/";
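The two patterns above can be compared side by side. This is only a sketch: the array keys and sample URLs are made up for illustration, and the host is assumed to be instagram.com as in the patterns.

```php
<?php
// Both patterns from above, keyed by an illustrative label.
$patterns = [
    'negated class'     => "/.*?instagram\.com\/p\/([^\/]*)/",
    'lazy + terminator' => "/.*?instagram\.com\/p\/(.*?)(?:\/|$)/",
];

// Illustrative URLs, with and without the trailing slash.
$urls = [
    "http://instagram.com/p/dPWjiVtX-C/",
    "http://instagram.com/p/dPWjiVtX-C",
];

$results = [];
foreach ($patterns as $name => $regex) {
    foreach ($urls as $url) {
        if (preg_match($regex, $url, $m) === 1) {
            $results[$name][] = $m[1];
            echo "$name: $url => {$m[1]}\n";
        }
    }
}
```

Both patterns stop at the first slash after `/p/`, so any further path segments would simply be left out of the captured group.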

Something you might try:

([^\/]+)\/?$

Demo on regex101

EDIT: Huh, you should have mentioned you need to check the site as well, since you put foo.com in your first example string... (and re-edited your question after that...).

You can use this instead to check the site:

^.*foo\.com.*?([^\/]+)\/?$
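As a rough sketch, the site-checking pattern can be dropped into PHP like this. The `/` delimiters are my addition, and the URLs are the example from the question with and without its trailing slash.

```php
<?php
// Pattern from above, wrapped in / delimiters: site check, then last path segment.
$regex = "/^.*foo\.com.*?([^\/]+)\/?$/";

$urls = [
    "http://foo.com/p/dPWjiVtX-C/",
    "http://foo.com/p/dPWjiVtX-C",
];

$ids = [];
foreach ($urls as $url) {
    // [^\/]+ cannot contain a slash, so the optional \/? before $
    // is the only place a trailing slash can go.
    if (preg_match($regex, $url, $m) === 1) {
        $ids[] = $m[1];
        echo $m[1], "\n"; // prints dPWjiVtX-C for both URLs
    }
}
```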
