
Regular expression: match URL segments in a specific order?

I would like to match a URL pattern that has optional segments.

I have URLs like this:

subdomain.domain.com/page/pageurl/pagename/123/
subdomain.domain.com/page/pageurl/pagename/
subdomain.domain.com/page/pageurl/
subdomain.domain.com/page/

Now I have a regex that matches all those situations:

^([a-z]+)\.domain\.com\/page(\/[a-z]+)?(\/[a-z]+)?(\/[0-9]+)?\/?$

But this regex fails if you go to this URL:

subdomain.domain.com/page/123/

It matches this URL too, and I don't want that to happen because the first segment should be [a-z]+ and nothing else. I understand why this is happening, but I can't figure out the right regex to suit my needs. I need a regex that matches those URLs, but in order, so if the first segment after page is a number, it should not match.
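To make the cause concrete, here is a minimal PHP sketch (PHP is assumed only because the answer's test uses it) that runs the pattern from the question against the unwanted URL and dumps the capture groups. The variable names are illustrative.

```php
<?php
// The pattern from the question, unchanged.
$re = '/^([a-z]+)\.domain\.com\/page(\/[a-z]+)?(\/[a-z]+)?(\/[0-9]+)?\/?$/';

// The URL that should be rejected but is accepted.
$url = 'subdomain.domain.com/page/123/';

$matches = array();
$matched = preg_match($re, $url, $matches);

// Groups 2 and 3 (the letter segments) are both optional, so the engine
// skips them and lets group 4 (the numeric segment) follow "/page" directly.
var_dump($matched);
var_dump($matches);
```

The dump should show groups 2 and 3 as empty strings and group 4 as "/123", which is why every segment being independently optional does not enforce any order.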

How would I do that? I'm going crazy right now :S

Rubular example: LINK

Thanks!

I think what you need is a negative look-behind:

^([a-z]+)\.domain\.com\/page(\/[a-z]+)?(\/[a-z]+)?((?<!\/page)\/[0-9]+)?\/?$

The (?<!\/page) asserts that '/page' does not immediately precede the numeric segment.

EDIT

I tested it like this:

$re = '/^([a-z]+)\.domain\.com\/page(\/[a-z]+)?(\/[a-z]+)?((?<!\/page)\/[0-9]+)?\/?$/';
foreach(array(
        'subdomain.domain.com/page/pageurl/pagename/123/',
        'subdomain.domain.com/page/pageurl/pagename/',
        'subdomain.domain.com/page/pageurl/',
        'subdomain.domain.com/page/',
        'subdomain.domain.com/page/123/',
        ) as $url
) {
    $matches = array();
    preg_match($re,$url,$matches);
    var_dump($matches);
}

and got matches for the first four, and not the last.

We can make the first segment mandatory inside a group, and make that whole group of segments optional, like so:

^([a-z]+)\.domain\.com\/page(?:(\/[a-z]+)(\/[a-z]+)?(\/[0-9]+)?)?\/?$
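As a quick check of the pattern above, the following sketch runs it against the five URLs from the question. The expected values encode the asker's stated requirements, and the names $cases and $results are illustrative.

```php
<?php
// The first letter segment is required inside a non-capturing group,
// and the group as a whole is optional.
$re = '/^([a-z]+)\.domain\.com\/page(?:(\/[a-z]+)(\/[a-z]+)?(\/[0-9]+)?)?\/?$/';

// URL => whether the asker wants it to match.
$cases = array(
    'subdomain.domain.com/page/pageurl/pagename/123/' => true,
    'subdomain.domain.com/page/pageurl/pagename/'     => true,
    'subdomain.domain.com/page/pageurl/'              => true,
    'subdomain.domain.com/page/'                      => true,
    'subdomain.domain.com/page/123/'                  => false,
);

$results = array();
foreach ($cases as $url => $expected) {
    $results[$url] = (preg_match($re, $url) === 1);
    printf("%-50s %s\n", $url, $results[$url] ? 'match' : 'no match');
}
```

One difference from the look-behind version is worth keeping in mind: a URL whose first segment is literally "page", such as subdomain.domain.com/page/page/123/, is accepted by this pattern but rejected by the look-behind one, because there '/page' immediately precedes the number.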

Another thing that might be useful is to allow any valid subdomain, the pattern would look like this:

^([\w.-]+)+\.domain\.com\/page(?:(\/[a-z]+)(\/[a-z]+)?(\/[0-9]+)?)?\/?$
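A small sketch of how the any-subdomain variant behaves. Note two assumptions: it uses ([\w.-]+) without the outer "+", since the nested quantifier should not change what matches here and can cause heavy backtracking on long non-matching input, and the host names are made up for illustration.

```php
<?php
// Simplified variant: a single quantified character class for the subdomain
// (an assumption of this sketch, not the answerer's exact pattern).
$re = '/^([\w.-]+)\.domain\.com\/page(?:(\/[a-z]+)(\/[a-z]+)?(\/[0-9]+)?)?\/?$/';

// Hypothetical URLs, including multi-level and underscore subdomains.
$urls = array(
    'subdomain.domain.com/page/pageurl/',
    'a.b-c.domain.com/page/pageurl/pagename/123/',
    'sub_1.domain.com/page/',
    'subdomain.domain.com/page/123/',
);

$subdomains = array();
foreach ($urls as $url) {
    $m = array();
    if (preg_match($re, $url, $m) === 1) {
        // Group 1 holds everything before ".domain.com".
        $subdomains[$url] = $m[1];
        printf("%-48s match, subdomain = %s\n", $url, $m[1]);
    } else {
        printf("%-48s no match\n", $url);
    }
}
```

The character class [\w.-] is permissive: it also accepts strings such as consecutive dots or a leading hyphen, so it is looser than a strict host name check.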

Edit: Fixed the pattern. As Umbrella pointed out (thanks), my previous pattern would not match your last example string. Oops.

