
How to use '|' in only part of a raw string pattern in Python regex

Required:

Check if the text passed includes a possible US zip code, formatted as follows: exactly 5 digits, and sometimes, but not always, followed by a dash with 4 more digits. The zip code needs to be preceded by at least one space, and cannot be at the start of the text.

My Code:

import re
def check_zip_code(text):
  result = re.search(r"^.* +\d{5}", text)
  return result is not None

To allow the optional r"\-\d{4}" part (a dash followed by 4 more digits), I tried changing line 3 to:

result = re.search(r"^.* +\d{5}|\-\d{4}", text)

But it does not work.

I have the following questions:

  1. How can I solve the zip code problem above?
  2. How can I apply | to only part of a pattern? (e.g. so that something like "a1|2" matches either a1 or a2)

Some of the test cases:

print(check_zip_code("The zip codes for New York are 10001 thru 11104.")) # True
print(check_zip_code("90210 is a TV show")) # False
print(check_zip_code("Their address is: 123 Main Street, Anytown, AZ 85258-0001.")) # True
print(check_zip_code("The Parliament of Canada is at 111 Wellington St, Ottawa, ON K1A0A9.")) # False

You are looking for an optional group, not an alternation. Additionally, add a negative lookahead at the beginning. That said, you can use:

(?!^)\b\d{5}(?:-\d{4})?\b
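Dropped into the function from the question, that pattern could look roughly like the sketch below. The name ZIP_PATTERN is only illustrative and is not part of the original code.

```python
import re

# Pattern from the answer: not at the very start of the text, exactly five
# digits, then an optional "-dddd" suffix, delimited by word boundaries.
# ZIP_PATTERN is an illustrative name, not something from the question.
ZIP_PATTERN = re.compile(r"(?!^)\b\d{5}(?:-\d{4})?\b")

def check_zip_code(text):
    # re.search scans the whole string, so no leading "^.*" is needed.
    return ZIP_PATTERN.search(text) is not None

print(check_zip_code("The zip codes for New York are 10001 thru 11104."))  # True
print(check_zip_code("90210 is a TV show"))  # False
print(check_zip_code("Their address is: 123 Main Street, Anytown, AZ 85258-0001."))  # True
print(check_zip_code("The Parliament of Canada is at 111 Wellington St, Ottawa, ON K1A0A9."))  # False
```

Note that this relies on a word boundary rather than a literal space before the digits. That is enough for the four test cases, but it is looser than the stated requirement, so a string such as "x-12345" would also pass.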

See a demo on regex101.com.
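As for the second question: | has the lowest precedence of the regex operators, so without a group it splits the entire pattern into two alternatives. To restrict it to part of the pattern, wrap that part in a group, for example a non-capturing (?:...). A small sketch, with variable names chosen only for illustration:

```python
import re

# Ungrouped: "|" splits the entire pattern, so this reads as "a1" OR "2".
ungrouped = re.compile(r"a1|2")
print(ungrouped.fullmatch("a2") is not None)  # False
print(ungrouped.fullmatch("2") is not None)   # True

# Grouped: only "1" and "2" are alternatives, so this reads as "a1" OR "a2".
grouped = re.compile(r"a(?:1|2)")
print(grouped.fullmatch("a1") is not None)    # True
print(grouped.fullmatch("a2") is not None)    # True

# The attempt from the question therefore reads as
# "^.* +\d{5}" OR "\-\d{4}", and the second branch can match on its own.
attempt = re.compile(r"^.* +\d{5}|\-\d{4}")
print(attempt.search("call 555-1234").group())  # -1234
```

The last line shows why the attempt misbehaves: a dash followed by four digits is accepted anywhere in the text, even when no five-digit code is present.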
