Match exact number of Digits and Words using Regex - Python27
I have this string:
P O BOX 32370, CA 92263
And this regex: \w{2} \d{5}
But it matches two substrings of the text: "OX 32370" and "CA 92263".
What I actually want is to extract the State and the Zip code.
I want to grab text that starts with a space, then exactly 2 letters, then one space, then exactly 5 digits.
You can add a word boundary \b on each side to make sure the match has no leading or trailing word characters (alphanumerics and underscore):
import re
re.findall(r"\b\w{2} \d{5}\b", "P O BOX 32370, CA 92263")
#['CA 92263']
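To make the effect of \b concrete, here is a minimal sketch that runs the question's original pattern and the bounded pattern on the same string. The variable names are only illustrative.

```python
import re

text = "P O BOX 32370, CA 92263"

# Without word boundaries the two word characters may be the tail of a
# longer word, so "OX" (from "BOX") is accepted.
unanchored = re.findall(r"\w{2} \d{5}", text)

# With \b on both sides the two word characters must form a whole word,
# and the five digits must not run on into further word characters.
anchored = re.findall(r"\b\w{2} \d{5}\b", text)

print(unanchored)  # ['OX 32370', 'CA 92263']
print(anchored)    # ['CA 92263']
```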
"to grab text starting with a space, then exactly 2 letters, then one space, then exactly 5 digits."
Unfortunately, the pattern \b\w{2} \d{5}\b will also find a match in strings such as "PO BOX 32370, 2A 92263", giving a result that doesn't fit your requirement, because \w matches any alphanumeric character (and the underscore), not only letters.
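A short sketch of that difference, using the example string above: \w{2} accepts "2A", while a letters-only character class does not. The variable names are only illustrative.

```python
import re

text = "PO BOX 32370, 2A 92263"

# \w accepts digits (and underscore), so "2A" passes as a "state".
with_w = re.findall(r"\b\w{2} \d{5}\b", text)

# [A-Za-z] restricts the two characters to ASCII letters only.
letters_only = re.findall(r"\b[A-Za-z]{2} \d{5}\b", text)

print(with_w)        # ['2A 92263']
print(letters_only)  # []
```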
To extract the State and the Zip code, use re.search() together with match.groupdict(), which returns all named subgroups of the match:
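import re
s = 'P O BOX 32370, CA 92263'
# Named groups: exactly two letters for the state, exactly five digits for the zip code
m = re.search(r'\b(?P<state>[a-zA-Z]{2}) (?P<zip_code>\d{5})\b', s)
# groupdict() maps group names to matched text; use an empty dict when nothing matched
result = m.groupdict() if m else {}
print(result)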
The output:
{'zip_code': '92263', 'state': 'CA'}
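If the same extraction is needed in several places, it could be wrapped in a small helper. This is only a sketch: the names STATE_ZIP and extract_state_zip are hypothetical and not part of the answer above, and returning None when nothing matches is an assumption about what the caller wants.

```python
import re

# Precompiled version of the pattern used above (hypothetical name).
STATE_ZIP = re.compile(r"\b(?P<state>[A-Za-z]{2}) (?P<zip_code>\d{5})\b")


def extract_state_zip(address):
    """Return (state, zip_code) for the first match in address, or None."""
    m = STATE_ZIP.search(address)
    if m is None:
        return None
    return m.group("state"), m.group("zip_code")


print(extract_state_zip("P O BOX 32370, CA 92263"))  # ('CA', '92263')
print(extract_state_zip("PO BOX 32370, 2A 92263"))   # None
```

Note that search() returns the first match only, so an address such as "PO 32370, CA 92263" would yield "PO" rather than "CA"; anchoring the pattern to the end of the string may be worth considering if the state and zip code always come last.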