Regex to match specific URL fragment and not all other URL possibilities

Say I have a website, example.com, with an account page. The page may have GET parameters, which still count as part of the account page. It may also have a URL fragment. If the fragment is #home.html, it is still the account page; any other fragment means a different sub-page of the account page.

So I need a regex (JS) to match this case. This is what I have managed to build so far:

example.com\/account\/(|.*\#home\.html|(\?(?!.*#.*)))$

https://regex101.com/r/ihjCIg/1

The first four rows are the cases I need to match. As you can see, the second row is not matched by my regex.

What am I missing here?

You could use two optional groups: one that matches ? followed by any characters except #, and another that matches #home.html.

Remember to escape the dot so that it is matched literally.

^example\.com\/account\/(?:\?[^#\r\n]*)?(?:#home\.html)?$
  • ^ Start of string
  • example\.com\/account\/ Match the literal prefix example.com/account/
  • (?: Non-capturing group
    • \?[^#\r\n]* Match ? followed by zero or more characters other than # or a newline
  • )? Close the group and make it optional
  • (?: Non-capturing group
    • #home\.html Match #home.html literally
  • )? Close the group and make it optional
  • $ End of string


    // The first four URLs should print true, the last two false.
    let pattern = /^example\.com\/account\/(?:\?[^#\r\n]*)?(?:#home\.html)?$/;
    [
      "example.com/account/",
      "example.com/account/?brand=mine",
      "example.com/account/#home.html",
      "example.com/account/?brand=mine#home.html",
      "example.com/account/#other.html",
      "example.com/account/?brand=mine#other.html"
    ].forEach(url => console.log(url + " --> " + pattern.test(url)));
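
Because the pattern is anchored with ^, it would reject the same URLs if they arrived with a scheme in front (for example from location.href). The sketch below shows one possible way to tolerate that. The helper name isAccountPage and the optional http/https prefix are assumptions for illustration and are not part of the original answer.

```javascript
// Hypothetical variant of the pattern above: an optional "http://" or
// "https://" prefix is allowed before the host. Everything after the
// prefix is unchanged.
const accountPage =
  /^(?:https?:\/\/)?example\.com\/account\/(?:\?[^#\r\n]*)?(?:#home\.html)?$/;

// Hypothetical helper name, used only for this example.
function isAccountPage(url) {
  return accountPage.test(url);
}

[
  "example.com/account/",
  "https://example.com/account/?brand=mine#home.html",
  "https://example.com/account/#other.html"
].forEach(url => console.log(url + " --> " + isAccountPage(url)));
```

Subdomains such as www.example.com are still rejected by this sketch; whether they should be accepted depends on the site.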

The third alternative in your group has a negative lookahead that rejects any text containing a #, but nothing after it matches the rest of the content up to the end of the line, so the $ fails as soon as anything follows the ?. Check this updated regex demo:

https://regex101.com/r/ihjCIg/3

Note that I have escaped your first dot (the one just before com) and added .* after the negative lookahead so that it matches your second sample.
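
The linked demo is not reproduced here, so the following is only a reconstruction from the description: the asker's pattern with the first dot escaped and .* appended after the negative lookahead. That the linked regex looks exactly like this is an assumption. The sketch runs both patterns over the six sample strings to show the difference.

```javascript
// The asker's original pattern. The third alternative can only match a lone
// "?": the lookahead consumes nothing, so "$" must follow the "?" directly.
const original = /example.com\/account\/(|.*\#home\.html|(\?(?!.*#.*)))$/;

// Reconstructed fix (an assumption based on the description above):
// first dot escaped, and ".*" added after the lookahead to consume the query.
const fixed = /example\.com\/account\/(|.*#home\.html|(\?(?!.*#.*).*))$/;

const samples = [
  "example.com/account/",
  "example.com/account/?brand=mine",
  "example.com/account/#home.html",
  "example.com/account/?brand=mine#home.html",
  "example.com/account/#other.html",
  "example.com/account/?brand=mine#other.html"
];

const results = samples.map(url => ({
  url,
  original: original.test(url),
  fixed: fixed.test(url)
}));

results.forEach(r =>
  console.log(r.url + " --> original: " + r.original + ", fixed: " + r.fixed)
);
```

Only the second sample changes: the original pattern rejects it and the reconstructed one accepts it.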

example\.com\/account\/((\??[^#\r\n]+)?(#?home\.html)?)?$

This matches your first four strings

example.com/account/
example.com/account/?brand=mine
example.com/account/#home.html
example.com/account/?brand=mine#home.html

and excludes your last two

example.com/account/#other.html
example.com/account/?brand=mine#other.html
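
The patterns in this thread agree on the six sample strings but are not equivalent. The last pattern has no ^ anchor and makes both the ? and the # optional, so it also accepts some strings that the anchored pattern with explicit ? and # rejects. A small sketch of the difference follows; the extra test strings are made up for illustration and do not come from the question.

```javascript
// Anchored pattern with a mandatory "?" and "#" inside the optional groups.
const anchored = /^example\.com\/account\/(?:\?[^#\r\n]*)?(?:#home\.html)?$/;

// Pattern from the answer above: no "^", and "?" / "#" are optional.
const loose = /example\.com\/account\/((\??[^#\r\n]+)?(#?home\.html)?)?$/;

// Illustrative edge cases (assumptions, not part of the original question).
const edgeCases = [
  "example.com/account/home.html", // no "#" before home.html
  "example.com/account/settings",  // extra path text without "?"
  "notexample.com/account/"        // different host ending in the same text
];

const comparison = edgeCases.map(url => ({
  url,
  anchored: anchored.test(url),
  loose: loose.test(url)
}));

comparison.forEach(c =>
  console.log(c.url + " --> anchored: " + c.anchored + ", loose: " + c.loose)
);
```

Whether the looser behaviour matters depends on which URLs can actually reach the check.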
