Match exact number of Digits and Words using Regex - Python27

I have this string:

P O BOX 32370, CA 92263

And this regex: \w{2} \d{5}

But it matches two substrings of the string: "OX 32370" and "CA 92263".

What I actually want is to extract the state and the ZIP code.

I want to grab text that starts with a space, followed by exactly 2 letters, then one space, then exactly 5 digits.

You can add a word boundary, \b, on each side of the pattern to make sure the match has no leading or trailing word characters (letters, digits, and underscore):

import re

re.findall(r"\b\w{2} \d{5}\b", "P O BOX 32370, CA 92263")
#['CA 92263']
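
To see what the word boundary changes, the two patterns can be compared on the same input. This is only a small sketch; the variable names are illustrative and not part of the original answer.

```python
import re

s = "P O BOX 32370, CA 92263"

# Without \b, the last two letters of "BOX" also qualify as \w{2}.
without_boundary = re.findall(r"\w{2} \d{5}", s)

# With \b, the two word characters must start at a word boundary,
# so "OX" (which sits inside "BOX") is rejected.
with_boundary = re.findall(r"\b\w{2} \d{5}\b", s)

print(without_boundary)  # ['OX 32370', 'CA 92263']
print(with_boundary)     # ['CA 92263']
```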

The requirement in the question was to grab text that starts with a space, then exactly 2 letters, then one space, then exactly 5 digits.

Unfortunately, the pattern \b\w{2} \d{5}\b will also find a match in strings such as "PO BOX 32370, 2A 92263", returning "2A 92263", which does not fit your requirement. This happens because \w matches any alphanumeric character (and the underscore), not only letters.
To extract the state and ZIP code, use re.search() with named groups together with match.groupdict(), which returns all the named subgroups of the match as a dictionary:

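import re

s = 'P O BOX 32370, CA 92263'
# [a-zA-Z]{2} accepts letters only; the named groups label the two parts
m = re.search(r'\b(?P<state>[a-zA-Z]{2}) (?P<zip_code>\d{5})\b', s)
# groupdict() maps each group name to its text; '' is used when nothing matches
result = m.groupdict() if m else ''

print(result)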

The output:

{'zip_code': '92263', 'state': 'CA'}
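
As a rough sketch of how the letters-only pattern behaves on both inputs discussed above, it can be wrapped in a small helper. The names STATE_ZIP and extract_state_zip are hypothetical and chosen only for illustration; the pattern itself is the one from the answer.

```python
import re

# Hypothetical names for illustration; the pattern is taken from the answer.
STATE_ZIP = re.compile(r"\b(?P<state>[a-zA-Z]{2}) (?P<zip_code>\d{5})\b")

def extract_state_zip(text):
    """Return (state, zip_code) for the first match, or None if there is none."""
    m = STATE_ZIP.search(text)
    return (m.group("state"), m.group("zip_code")) if m else None

print(extract_state_zip("P O BOX 32370, CA 92263"))  # ('CA', '92263')
print(extract_state_zip("PO BOX 32370, 2A 92263"))   # None
```

Returning None rather than an empty string keeps the "no match" case easy to test with a plain `if`, though either convention works.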
