
Python RegEx splitting street and number from address

I want to split an address string into street and house number.

This is my current solution:

matches = re.match(r'^(?P<street>[^,]*?)[,\s]*(?P<number>\d[\w\s\-/]*$)', street_number)

but it does not work in all cases. It handles examples like these:

working_examples = [
    'Somestreet 1',
    'Somestreet1',
    'Somestreet1a',
    'Somestreet 1a',
    'Somestreet 1 a'
]

print(matches.groupdict()) prints the following for the first element of working_examples:

{'street': 'Somestreet', 'number': '1'}

However, in these cases:

not_working_examples = [
    'Some 1 street',
    'Some 1a street'
]

it prints, for the first element:

{'street': 'Some', 'number': '1 street'}

whereas my goal is to get

{'street': 'Some 1 street', 'number': None}

import re


examples = [
    'Somestreet 1',
    'Somestreet1',
    'Somestreet1a',
    'Somestreet 1a',
    'Somestreet 1 a',
    'Some 1 street',
    'Some 1a street'
]

for s in examples:
    matches = re.match(r'^(?P<street>.+?)[,\s]*(?P<number>\d\s?\w?)$', s)
    if matches:
        print(matches.groups())
    else:
        print(s, "doesn't match")

Output:

('Somestreet', '1')
('Somestreet', '1')
('Somestreet', '1a')
('Somestreet', '1a')
('Somestreet', '1 a')
Some 1 street doesn't match
Some 1a street doesn't match

Demo & explanation
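
The loop above only reports a non-match, while the question asks for a dict with 'number' set to None in that case. A minimal sketch of how that could look follows. The names ADDRESS_RE and split_address are hypothetical, and the pattern is a small variation on the one above, not the original: it assumes \d+ so that multi-digit numbers stay in one piece, and [A-Za-z] in place of \w so that the optional suffix can only be a letter.

```python
import re

# Variation on the answer's pattern (an assumption, not the original):
#   \d+        one or more digits, so "123" is kept whole
#   \s?        an optional space before the suffix
#   [A-Za-z]?  an optional single-letter suffix such as "a"
# The trailing $ only accepts a number at the very end of the string.
ADDRESS_RE = re.compile(r'^(?P<street>.+?)[,\s]*(?P<number>\d+\s?[A-Za-z]?)$')


def split_address(text):
    """Split text into street and number; number is None if none is found."""
    match = ADDRESS_RE.match(text)
    if match:
        return match.groupdict()
    # No trailing house number: treat the whole string as the street.
    return {'street': text, 'number': None}


for s in ['Somestreet 1', 'Somestreet 1 a', 'Some 1 street']:
    print(split_address(s))
```

Because the fallback returns the unchanged input as the street, strings such as 'Some 1 street' come back with 'number' set to None instead of being dropped.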
