
Include multiple (wildcard) URIs, exclude subdomains in regular expression

Hope you can help me out with formatting the correct RegEx.

I want to:


1) Include all traffic to domain.com(.*)

  • EXCLUDING all subdomains
  • EXCEPT all traffic to the specific URI sub.domain.com/folder(.*)

2) Include all traffic to the specific URI sub.extdomain.com/folder(.*)

Some examples:

Include:

  • domain.com
  • domain.com/team
  • domain.com/blog
  • Specific: hello.domain.com/bonjour
  • Specific: bye.extdomain.com/aurevoir/salut

Exclude:

  • hello.domain.com
  • bye.domain.com
  • All other subdomains & other sites

Already tried the following, but it still includes subdomains:

(domain\.com|sub\.domain\.com/folder(.*)|sub\.domain\.com/folder(.*))

The regex /domain\.com/ also matches every subdomain, because it can match anywhere inside the string. Use /^domain\.com/ to match only strings that begin with "domain.com" (no subdomain).

Note this assumes you removed the protocol from the url (http://).
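A minimal Python sketch of the difference between the two patterns. The helper name `strip_protocol` is hypothetical and only illustrates the assumption above that the protocol has already been removed:

```python
import re

def strip_protocol(url: str) -> str:
    """Drop a leading http:// or https:// so the host sits at the start of the string."""
    return re.sub(r"^https?://", "", url)

unanchored = re.compile(r"domain\.com")
anchored = re.compile(r"^domain\.com")

host = strip_protocol("http://hello.domain.com/page")

# Unanchored: finds "domain.com" inside "hello.domain.com/page".
print(bool(unanchored.search(host)))
# Anchored: the string starts with "hello.", so nothing matches.
print(bool(anchored.search(host)))
```

Without the protocol stripped, the anchored pattern would reject every URL, since the string would start with "http".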

The second alternative in the regex you posted is identical to the third. I assume you meant the two special cases; they also need the "^" anchor at the beginning.

There is no need for the (.*) at the end: the pattern matches the start of the string all the same without it.

(^domain\.com|^hello\.domain\.com\/bonjour|^bye\.extdomain\.com\/folder)

Explanation: the regex accepts

  • ^domain\.com - all URLs beginning with "domain.com" (no subdomain)
  • or ^hello\.domain\.com\/bonjour - all URLs beginning with "hello.domain.com/bonjour"
  • or ^bye\.extdomain\.com\/folder - all URLs beginning with "bye.extdomain.com/folder"

Optionally, because all three alternatives start the same way, you can factor out the common prefix ^:

^(domain\.com|hello\.domain\.com\/bonjour|bye\.extdomain\.com\/folder)

See this website for help reading the regex: http://www.regexper.com/#%5E(domain%5C.com%7Chello%5C.domain%5C.com%5C%2Fbonjur%7Cbye%5C.extdomain%5C.com%5C%2Ffolder2)
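To check the anchored form against the examples from the question, here is a small Python sketch. It assumes the URLs arrive without a protocol, and it substitutes the question's concrete paths (/bonjour and /aurevoir) for the folder placeholder; the function name `is_included` is hypothetical:

```python
import re

# Assumption: concrete paths taken from the question's "Include" examples.
PATTERN = re.compile(
    r"^(?:domain\.com|hello\.domain\.com/bonjour|bye\.extdomain\.com/aurevoir)"
)

def is_included(url: str) -> bool:
    """Return True if the protocol-less URL starts with one of the three alternatives."""
    return PATTERN.search(url) is not None

include = [
    "domain.com",
    "domain.com/team",
    "domain.com/blog",
    "hello.domain.com/bonjour",
    "bye.extdomain.com/aurevoir/salut",
]
exclude = [
    "hello.domain.com",
    "bye.domain.com",
    "other.example.org",
]

for url in include + exclude:
    print(f"{url:35} {is_included(url)}")
```

In Python the forward slash needs no escaping; the `\/` form is only required where "/" is also the regex delimiter.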

I added a "^" to the start of the regex to require that the string begins with domain.com. The second clause allows for folders following domain.com. The third clause allows anything on a subdomain, as long as it has a "/" followed by some text.

(^domain\.com$|^domain\.com\/\w*|\w+\.domain\.com\/\w+)

I suggest using this regular expression:

'#\b(?:domain\.com|hello\.domain\.com/bonjour|bye\.extdomain\.com/aurevoir/salut)\b#i'
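One thing worth checking with a word-boundary pattern like this: `\b` only asserts a boundary between a word character and a non-word character, and the dot before "domain" is such a boundary. The Python sketch below translates the `#...#i` delimiters and flag to `re.IGNORECASE`, which assumes the original targets a PCRE-style engine:

```python
import re

# Translation of the pattern above; delimiters and the "i" flag become re.IGNORECASE.
WORD_BOUNDARY = re.compile(
    r"\b(?:domain\.com|hello\.domain\.com/bonjour|bye\.extdomain\.com/aurevoir/salut)\b",
    re.IGNORECASE,
)

def matches(url: str) -> bool:
    """Return True if the pattern is found anywhere in the URL."""
    return WORD_BOUNDARY.search(url) is not None

# The specific URIs from the question match, as intended.
print(matches("hello.domain.com/bonjour"))
print(matches("bye.extdomain.com/aurevoir/salut"))

# A bare subdomain also matches, via the "domain.com" alternative
# found after the dot, so subdomains are not excluded by \b alone.
print(matches("hello.domain.com"))
```

If subdomains must be excluded, an anchored pattern is the safer choice under these assumptions.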

