
regex match and replace multiple patterns

A user submits an address, and I need to map that input onto my keys. The join works if I use the address without its street suffix. For example, I want to convert:

COVERED WAGON TRAIL
CHISHOLM TRAIL
LAKE TRAIL
CHESTNUT ST
LINCOLN STREET

to:

COVERED WAGON
CHISHOLM
LAKE
CHESTNUT
LINCOLN

However, I can't work out how to write the code so that only the last word is replaced. Instead I get:

LINCOLN
CHESTNUT
CHISHOLM
LAKEAIL
CHISHOLMAIL
COVERED WAGONL

I've tried a verbose regex, re.sub, and the $ anchor.

import re
target = '''

LINCOLN STREET
CHESTNUT ST
CHISHOLM TR
LAKE TRAIL
CHISHOLM TRAIL
COVERED WAGON TRL

'''
rdict = {
' ST': '',
' STREET': '',
' TR': '',
' TRL': '',
}
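# Build one alternation from the keys; the pattern is not anchored to the end of a line.
robj = re.compile('|'.join(rdict.keys()))
# Earlier attempt; its result is discarded, so it has no effect.
re.sub(' TRL', '', target.rsplit(' ', 1)[0]), target
result = robj.sub(lambda m: rdict[m.group(0)], target)
print(result)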

Use re.sub with $ and the re.M flag, so that a suffix is matched only at the end of each line.

target = '''
LINCOLN STREET
CHESTNUT ST
CHISHOLM TR
LAKE TRAIL
CHISHOLM TRAIL
COVERED WAGON TRL
'''

import re
# re.M makes $ match at the end of every line, not only at the end of the string
print(re.sub(r'\s+(STREET|ST|TRAIL|TRL|TR)\s*$', '', target, flags=re.M))
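Without the $ anchor, ' TR' also matches the beginning of ' TRAIL' and ' TRL', which is where outputs such as LAKEAIL and COVERED WAGONL come from. The following is a minimal Python 3 sketch of the same idea, building the pattern from a list of suffixes instead of typing the alternation by hand. The names SUFFIXES, SUFFIX_RE and strip_suffix are illustrative and not from the original post, and the suffix list is an assumption based on the question's rdict keys plus TRAIL. It uses [ \t] rather than \s so that newlines are never consumed.

```python
import re

# Hypothetical suffix list: the question's rdict keys (without the leading space) plus TRAIL.
SUFFIXES = ("ST", "STREET", "TR", "TRL", "TRAIL")

# Escape each suffix and put the longest alternatives first, then anchor the
# whole group to the end of every line with $ and re.MULTILINE.
_alternation = "|".join(re.escape(s) for s in sorted(SUFFIXES, key=len, reverse=True))
SUFFIX_RE = re.compile(r"[ \t]+(?:" + _alternation + r")[ \t]*$", re.MULTILINE)


def strip_suffix(text):
    """Remove a trailing street suffix from each line of text."""
    return SUFFIX_RE.sub("", text)


target = """LINCOLN STREET
CHESTNUT ST
CHISHOLM TR
LAKE TRAIL
CHISHOLM TRAIL
COVERED WAGON TRL"""

print(strip_suffix(target))
```

With the sample above this prints LINCOLN, CHESTNUT, CHISHOLM, LAKE, CHISHOLM and COVERED WAGON, one per line. Sorting by length is not strictly required once $ is in place, since the regex engine backtracks to the next alternative when the anchor fails, but it makes the intent explicit.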

If you do store your string in the format:

target = '''

LINCOLN STREET
CHESTNUT ST
CHISHOLM TR
LAKE TRAIL
CHISHOLM TRAIL
COVERED WAGON TRL

'''

There is no need for a regex; simply drop the last word of each line:

>>> print('\n'.join(x.rsplit(None, 1)[0] for x in target.strip().split('\n')))
LINCOLN
CHESTNUT
CHISHOLM
LAKE
CHISHOLM
COVERED WAGON
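One caveat with the one-liner above is that it drops the last word of every line, whether or not that word is a suffix, so an input that is already COVERED WAGON would become COVERED. A hedged Python 3 sketch of a regex-free variant that only drops the last word when it is a known suffix follows; SUFFIXES and drop_suffix are hypothetical names, and the suffix set is assumed from the question.

```python
# Hypothetical set of suffixes, assumed from the question's examples.
SUFFIXES = {"ST", "STREET", "TR", "TRL", "TRAIL"}


def drop_suffix(line):
    """Return line without its last word if that word is a known suffix."""
    stripped = line.strip()
    parts = stripped.rsplit(None, 1)  # split off the last word only
    if len(parts) == 2 and parts[1] in SUFFIXES:
        return parts[0]
    # Single-word or suffix-free lines are returned unchanged (minus outer whitespace).
    return stripped


lines = ["LINCOLN STREET", "CHESTNUT ST", "COVERED WAGON TRL", "COVERED WAGON"]
for line in lines:
    print(drop_suffix(line))
```

Here the last input line is left as COVERED WAGON because WAGON is not in the suffix set, and a one-word address such as ST is kept as-is rather than being emptied.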


 