
How to match a particular URL pattern in Python regex

I am having trouble matching a pattern of this format: p#.g.com, where # is any number other than 1 or 2. For instance, if the input is p1.g.com or p2.g.com, I don't need to match.

But if it is any other number, such as p3.g.com or p29.g.com , then I need to match.

My current pattern is r"(?P<url>p([^1,2])\\.g\\.com)" , but this fails on p##.g.com, that is, on any two-digit number. There is no upper limit on #, so it could be 3, 999, or anything in between.

I also tried r"(?P<url>p([^1,2])\\d+\\.g\\.com)" but that does not match any number beginning with a 1 or a 2. For instance, 11 and 23 are not matched, although I do want them matched.

Try this regex:

p(?:[03-9]|\d{2,})\.g\.com


Explanation:

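  • Matches the character p
  • Non-capturing group (?:...) that matches one of:
    • A single digit that is 0 or 3-9
    • Two or more digits (\d{2,}), which covers every number with at least two digits, including 11 and 23
  • Matches the literal string .g.com (each dot is escaped)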
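
As a quick check, the regex above can be dropped into Python's re module. This is only a minimal sketch: the named group url is carried over from the question, while PATTERN, find_url, and the sample host names are hypothetical names chosen for illustration.

```python
import re

# Regex from the answer, wrapped in the named group "url" used in the question.
# Single backslashes are enough because this is a raw string.
PATTERN = re.compile(r"(?P<url>p(?:[03-9]|\d{2,})\.g\.com)")


def find_url(text):
    """Return the matched host, or None when the number is exactly 1 or 2."""
    match = PATTERN.search(text)
    return match.group("url") if match else None


# Hypothetical sample inputs covering the cases from the question.
samples = ["p1.g.com", "p2.g.com", "p3.g.com", "p11.g.com",
           "p23.g.com", "p29.g.com", "p999.g.com"]
results = {host: find_url(host) for host in samples}

for host, found in results.items():
    print(host, "->", found)
```

Note that search() is unanchored, so the pattern is also found inside a longer string. If the whole input must be the host name, PATTERN.fullmatch(text) is probably the better choice.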
