Improve regex in Python (2.7) to find short / incomplete UK postcodes

I have a little function that finds full UK postcodes (e.g. DE2 7TT) in strings and returns them.

However, I'd like to change it to ALSO return postcodes that consist of just one or two letters followed by one or two digits (e.g. SE3, E2, SE45, E34).

That is, it must collect BOTH forms of UK postcode (incomplete and complete).

The code is:

import re

def pcsearch(postcode):
    if bool(re.search('(?i)[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][A-Z]{2}', postcode)):
        postcode = re.search('(?i)[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][A-Z]{2}', postcode)
        postcode = postcode.group()
        return postcode
    else:
        postcode = "na"
        return postcode

What tweaks are needed to get this to ALSO work with those shorter, incomplete postcodes?

You might write the pattern using an alternation and word boundaries.

(?i)\b(?:[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][A-Z]{2}|[A-Z]{1,2}\d{1,2})\b
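
To make the role of each part easier to see, here is a sketch of the same pattern written with re.VERBOSE, followed by two deliberately altered variants. The names POSTCODE, no_boundary and short_first are only illustrative and not part of the original answer. The variants suggest why the word boundaries are needed and why the full-postcode alternative is listed first.

```python
import re

# Same pattern as above, spelled out with re.VERBOSE so each part can carry a comment.
# In verbose mode a literal space has to be written as "[ ]" (or escaped).
POSTCODE = re.compile(r"""
    \b
    (?:
        [A-Z]{1,2}[0-9R][0-9A-Z]?[ ][0-9][A-Z]{2}   # full postcode, e.g. DE2 7TT
      |
        [A-Z]{1,2}\d{1,2}                           # short form only, e.g. SE3, E34
    )
    \b
""", re.IGNORECASE | re.VERBOSE)

print(POSTCODE.search("Send to DE2 7TT today").group())  # DE2 7TT
print(POSTCODE.search("E123"))                           # None

# Variant 1 (illustrative): without \b the short form matches inside "E123".
no_boundary = re.search(r"(?i)[A-Z]{1,2}\d{1,2}", "E123")
print(no_boundary.group())                               # E12

# Variant 2 (illustrative): with the short alternative first, the regex engine
# accepts "DE2" and never tries the full form.
short_first = re.search(
    r"(?i)\b(?:[A-Z]{1,2}\d{1,2}|[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][A-Z]{2})\b",
    "DE2 7TT",
)
print(short_first.group())                               # DE2
```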

The code could be refactored using the pattern only once by checking the match:

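import re

def pcsearch(postcode):
    # Full postcode first, then the short form; \b stops partial matches such as "E12" in "E123"
    pattern = r"(?i)\b(?:[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][A-Z]{2}|[A-Z]{1,2}\d{1,2})\b"
    match = re.search(pattern, postcode)
    if match:
        return match.group()
    else:
        return "na"

# The last two entries have too many digits and should not match
strings = [
    "SE3",
    "E2",
    "SE45",
    "E34",
    "DE2 7TT",
    "E123",
    "SE222"
]

for s in strings:
    print(pcsearch(s))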

Output

SE3
E2
SE45
E34
DE2 7TT
na
na
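
If a string may contain more than one postcode, the same pattern could also be used with re.findall instead of re.search. This is only a sketch: the helper name find_all_postcodes and the sample sentence are made up for illustration, and returning an empty list rather than "na" is an assumption about what would be convenient.

```python
import re

# Same pattern as in pcsearch, compiled once; re.IGNORECASE replaces the inline (?i).
PATTERN = re.compile(
    r"\b(?:[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][A-Z]{2}|[A-Z]{1,2}\d{1,2})\b",
    re.IGNORECASE,
)

def find_all_postcodes(text):
    """Return every full or short postcode in text, in order of appearance."""
    # The pattern has no capturing groups, so findall returns the whole matches.
    return PATTERN.findall(text)

print(find_all_postcodes("Offices in SE3 and at DE2 7TT, formerly E34."))
# ['SE3', 'DE2 7TT', 'E34']
```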
