Edit regex to recognise very short street name or number for Python 2.7 script and avoid fatal attribute error

I have a little script in Python 2.7 with a regex that recognises when a street or avenue name is in a string. However, it doesn't work for street names of only two letters (OK, that's very rare) or for streets with numbers (e.g. 29th Street). I'd like to edit it to make the latter work (e.g. recognise 29th Street or 1st Street).

This works:

import re

def street(search):
    if bool(re.search('(?i)street', search)):
        found = re.search('([A-Z]\S[a-z]+\s(?i)street)', search)
        found = found.group()
        return found.title()
    if bool(re.search('(?i)avenue', search)):
        found = re.search('([A-Z]\S[a-z]+\s(?i)avenue)', search)
        found = found.group()
        return found.title()
    else:
        found = "na"
        return found

userlocation = street("I live on Stackoverflow Street")

print userlocation

Or, with "Stackoverflow Avenue"

...but these fail:

userlocation = street("I live on SO Street")
userlocation = street("I live on 29th Street")
userlocation = street("I live on 1st Avenue")
userlocation = street("I live on SO Avenue")

with this error (because nothing is found):

me@me:~/Documents/test$ python2.7 test_street.py
Traceback (most recent call last):
  File "test_street.py", line 12, in <module>
    userlocation = street("I live on 29th Street")
  File "test_street.py", line 6, in street
    found = found.group()
AttributeError: 'NoneType' object has no attribute 'group'

As well as correcting the query so that it recognises "1st", "2nd", "80th", "110th" etc., I'd ideally also like to avoid a fatal error if it doesn't find anything.

You can merge the two conditions inside your function, and then match any run of non-space characters ( \S+ ) followed by a whitespace character and the keyword "Street" or "Avenue" ( \s(Street|Avenue) ).

import re

def street(search):
    if bool(re.search('(?i)(street|avenue)', search)):
        found = re.search('(?i)\S+\s(Street|Avenue)', search)
        found = found.group()
        return found.title()
    else:
        found = "na"
        return found

print street("I live on Stackoverflow Street")
print street("I live on SO street")
print street("I live on 29th Street")
print street("I live on 1st Avenue")
print street("I live on SO Avenue")

Output:

Stackoverflow Street
So Street
29Th Street
1St Avenue
So Avenue
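
Two things are worth noting about the function above. It can still raise the same AttributeError when a keyword is present but no word precedes it (for example "Street life"), and .title() upper-cases the letter that follows a digit, which is why the output shows "29Th" and "1St". The following is a minimal sketch that guards against both with a single search. The names street_safe and STREET_RE are made up for illustration and are not part of the original script; the code is written to run on both Python 2.7 and Python 3.

```python
import re

# Hypothetical names for illustration; not part of the original script.
# One word (\S+), one whitespace character, then the keyword.
STREET_RE = re.compile(r'(\S+)\s(street|avenue)\b', re.IGNORECASE)

def street_safe(search):
    match = STREET_RE.search(search)
    if match is None:
        # No "<word> street/avenue" pair found: return the fallback
        # instead of calling .group() on None.
        return "na"
    name, kind = match.groups()
    # Leave names that start with a digit ("29th") untouched so the
    # ordinal suffix is not upper-cased; capitalise everything else.
    if not name[0].isdigit():
        name = name.capitalize()
    return name + " " + kind.capitalize()
```

Checking the match object for None is what removes the fatal error; the digit check is only one possible way to keep ordinals such as "29th" readable.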

This matches only the last word of the street name. If you want to match multi-word street names and you can rely on a specific keyword that always occurs before the street name, you may be able to capture the whole name.
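
As a rough sketch of that idea, suppose (purely as an assumption for illustration) that the street name always follows the word "on". A lazy group can then capture everything between that keyword and "Street" or "Avenue". The names MULTIWORD_RE and street_after_on are hypothetical.

```python
import re

# Assumption for illustration: the street name always follows the word "on".
# .+? is lazy, so the capture stops at the first "street"/"avenue" keyword.
MULTIWORD_RE = re.compile(r'\bon\s+(.+?\s(?:street|avenue))\b', re.IGNORECASE)

def street_after_on(search):
    match = MULTIWORD_RE.search(search)
    if match is None:
        # Either no "on" keyword or no street/avenue after it.
        return "na"
    # Return the captured name as written, without changing its case.
    return match.group(1)
```

This only works as well as the keyword assumption holds: if "on" appears earlier in the sentence for another reason, everything from that point up to the street keyword is captured.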

Check the regex demo here.
