PHP: Regex for Matching URLs with a Certain Pattern and One Optional Wildcard

Maybe this question has already been answered somewhere on this site, but I'm not sure, because I'm very bad at regex. My question is probably very basic. I need to check whether a URL matches the following pattern(s):

'http://www.my-domain.com/dir/file.htm'
'http://www.my-domain.com/dir/file2.htm'
'http://www.my-domain.com/dir/file3.htm'

So basically, I only need a simple regex pattern for matching URLs with one wildcard that can either be empty or contain numbers.

Thank you, and sorry for my inability to solve this very basic problem.

/^https?\:\/\/www\.my\-domain\.com\/dir\/file[0-9]*\.htm$/ matches all of your example strings:

if (preg_match('/^https?\:\/\/www\.my\-domain\.com\/dir\/file[0-9]*\.htm$/',$url,$matches))
{
    var_dump($matches);
}

Since the regex isn't (or wasn't) clear to you, here's what this expression does:

  • ^https? : checks whether the string starts with http, and allows an optional s
  • \:\/\/www\.my\-domain\.com\/dir\/file : verifies the actual base URL. Slashes and dots need to be escaped because of their special meaning here (the slash is the pattern delimiter, and an unescaped dot matches almost any character). Escaping the colon and the dash is not strictly required outside a character class, but it is harmless.
  • file[0-9]*\.htm$ : matches file followed by zero or more digits, so this will match file , file1 as well as file0 or file00000123434 . Then the .htm is matched, and the $ ensures that this is the end of the string you are trying to match.
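The breakdown above can be checked with a small script. This is only a sketch: the helper name matchesFileUrl and the extra sample URLs are made up for illustration, and # is used as the delimiter (an assumption, not what the answer uses) so that the slashes need no escaping.

```php
<?php
// Hypothetical helper (not from the answer): wraps the answer's pattern.
// '#' is the delimiter here, so the slashes in the URL need no escaping.
function matchesFileUrl(string $url): bool
{
    $pattern = '#^https?://www\.my-domain\.com/dir/file[0-9]*\.htm$#';
    return preg_match($pattern, $url) === 1;
}

// The first three inputs are from the question; the rest are illustrative.
$candidates = [
    'http://www.my-domain.com/dir/file.htm',
    'http://www.my-domain.com/dir/file2.htm',
    'http://www.my-domain.com/dir/file3.htm',
    'https://www.my-domain.com/dir/file00000123434.htm',
    'http://www.my-domain.com/dir/fileA.htm',   // letter instead of digits
    'http://www.my-domain.com/dir/file2.html',  // .html is not allowed by this pattern
];

foreach ($candidates as $url) {
    echo $url, ' => ', matchesFileUrl($url) ? 'match' : 'no match', PHP_EOL;
}
```

The first four URLs should be reported as a match and the last two as no match. Picking a delimiter that does not occur in the pattern is purely a readability choice; the escaped version with / behaves the same.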

That's it, really. A pretty straightforward regex. You could add some more flexibility to it by, for instance, allowing both html and htm at the end of your string, in the same way the expression allows for both http and https: \.html?$ . There are other ways to write the same thing: \.html{0,1} matches either 0 or 1 l at the end. Or even: \.[html]{3,4} , which matches either 3 or 4 chars from the character class [html]: htm and html, but also hhh, htth, etc., so it is looser than the first two.
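To see how these endings differ, here is a rough comparison. The helper endingMatches and the sample file names are invented for this sketch, and each pattern is anchored with $ so that only the end of the string is tested.

```php
<?php
// Hypothetical helper: true when $name matches the given ending pattern.
function endingMatches(string $pattern, string $name): bool
{
    return preg_match($pattern, $name) === 1;
}

// The three ways of writing the ending mentioned above, anchored with $.
$patterns = [
    'optional l'       => '/\.html?$/',
    'quantifier {0,1}' => '/\.html{0,1}$/',
    'character class'  => '/\.[html]{3,4}$/',
];

// Illustrative file names, including ones only the character class accepts.
$names = ['file.htm', 'file.html', 'file.hhh', 'file.htth', 'file.ht'];

foreach ($patterns as $label => $pattern) {
    foreach ($names as $name) {
        $result = endingMatches($pattern, $name) ? 'match' : 'no match';
        echo str_pad($label, 18), str_pad($name, 12), $result, PHP_EOL;
    }
}
```

The first two patterns should accept only file.htm and file.html, while the character class version also accepts file.hhh and file.htth. None of them accepts file.ht.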

Play around with it, have fun. Regexes aren't that hard once you've got the hang of the basics.

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.
