
Python regex pattern matching with ranges and whitespaces

I am attempting to match strings that would have a pattern of:

  • two uppercase Latin letters
  • two digits
  • two uppercase Latin letters
  • four digits
  • ex: MH 45 LE 4098

There can be optional whitespace between the groups, and each group needs to be limited to the number of characters listed above. I was trying to group them and set a limit on the characters, but I am not matching any strings that fall within the defined parameters. I had attempted building a set like so: template = '[AZ{2}0-9{2,4}]' , but was still receiving errors when the last run of digits exceeded 4.

template = '(A-Z{2})\s?(\d{2})\s?(A-Z{2})\s?(\d{4})'

This was the other attempt when I tried being more verbose, but then couldn't match anything.

This is probably the regex you are looking for:

[A-Z]{2}\s?[0-9]{2}\s?[A-Z]{2}\s?[0-9]{4}

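Note that \s? allows at most one whitespace character (a space, a tab, a newline, etc.) between groups. Use \s* instead if several whitespace characters in a row should be allowed.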
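As a minimal sketch of how this pattern could be checked in Python: the helper name is_plate and the sample strings other than MH 45 LE 4098 are assumptions made for illustration, not part of the answer. re.fullmatch is used so that extra trailing digits cause a rejection rather than being silently ignored.

```python
import re

# Pattern from the answer; the raw string keeps the backslashes intact.
PLATE = re.compile(r"[A-Z]{2}\s?[0-9]{2}\s?[A-Z]{2}\s?[0-9]{4}")

def is_plate(text):
    """Return True only if the whole string fits the pattern."""
    return PLATE.fullmatch(text) is not None

print(is_plate("MH 45 LE 4098"))   # True
print(is_plate("MH45LE4098"))      # True: the whitespace is optional
print(is_plate("MH 45 LE 40981"))  # False: five trailing digits
print(is_plate("MH  45 LE 4098"))  # False: \s? accepts at most one space
```

With re.search or re.match instead of re.fullmatch, the five-digit string would still produce a match on its first four digits, which may be the behaviour described in the question.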

You are close; you need to put square brackets around A-Z so that {2} applies to the whole range instead of only to Z . As it stands, A-Z{2} literally matches A-ZZ .
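A quick way to see the difference is to test both forms directly. This is only an illustrative sketch; the variable names broken and fixed are made up for the example.

```python
import re

# Without brackets: the literal text "A-" followed by exactly two "Z".
broken = re.compile(r"A-Z{2}")
# With brackets: any two uppercase Latin letters.
fixed = re.compile(r"[A-Z]{2}")

print(bool(broken.fullmatch("A-ZZ")))  # True
print(bool(broken.fullmatch("MH")))    # False
print(bool(fixed.fullmatch("MH")))     # True
```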

So

template = r"([A-Z]{2})\s?(\d{2})\s?([A-Z]{2})\s?(\d{4})"

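should do. Square brackets [ ] define a character class, here the range of letters from A to Z, whereas parentheses ( ) only group what is inside them. If we put ( , the pattern (A-Z){2} would try to match A-ZA-Z , i.e. the literal text A-Z two times.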

You can see a demo here, and you can change the square brackets to ( or omit them to see the effect in the demo.
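The same experiment can be run locally. The sketch below assumes a variant of the template in which all four parts are captured, and uses a hypothetical helper split_plate; neither is prescribed by the answer.

```python
import re

# Assumed variant of the template with a capture group around each part.
template = r"([A-Z]{2})\s?(\d{2})\s?([A-Z]{2})\s?(\d{4})"

def split_plate(text):
    """Return the four captured parts, or None if the string does not fit."""
    m = re.fullmatch(template, text)
    return m.groups() if m else None

print(split_plate("MH 45 LE 4098"))  # ('MH', '45', 'LE', '4098')
print(split_plate("MH45LE40981"))    # None: too many trailing digits

# Parentheses alone only group, so this matches the literal text "A-Z" twice.
print(re.fullmatch(r"(A-Z){2}", "A-ZA-Z") is not None)  # True
print(re.fullmatch(r"(A-Z){2}", "MH") is not None)      # False
```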
