Extracting trimmed data from a string using regex in Python

I have text formatted like this:

address = "street: street A city: City B floor:"

I want to extract the street, city and floor from the address. Any of these values can be blank.

>>> import re
>>> address_pattern = re.compile(
...     r'street:\s?(?P<street>.*)\s?'
...     r'city:\s?(?P<city>.*)\s?'
...     r'floor:\s?(?P<floor>.*)\s?'
... )
>>> address_pattern.search(address).groups()
('street A ', 'City B ', '')

As you can see, the captured street and city values end with trailing whitespace, which I am trying to avoid.

Obviously the simple solution here would be to strip the whitespace, but where is the fun in that? If it's also possible to make it return None for an empty string, that would be great.

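Use non-greedy quantifiers (.*?) for the captured groups and a greedy \s* for the whitespace after each one. Also anchor the end of the pattern with $. Without it, nothing after the lazy floor group forces it to expand, so it would always match the empty string, even when a floor value is present:

>>> address_pattern = re.compile(
...     r'street:\s?(?P<street>.*?)\s*'
...     r'city:\s?(?P<city>.*?)\s*'
...     r'floor:\s?(?P<floor>.*?)\s*$'
... )
>>> address_pattern.search(address).groups()
('street A', 'City B', '')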
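
For the second part of the question (returning None for a blank value), one possible approach is to make each group optional and require it to start with a non-whitespace character, so that a blank field leaves the group unmatched and Python reports it as None. The following is only a sketch under the assumption that the labels always appear in the order street, city, floor; the names ADDRESS_PATTERN and parse_address are made up for illustration, and the trailing $ is added so the lazy floor group has something to stop at.

```python
import re

# Hypothetical variant of the pattern above, not part of the original answer.
# Each group is optional and must begin with a non-whitespace character (\S),
# so a blank field leaves the group unmatched, which Python reports as None.
ADDRESS_PATTERN = re.compile(
    r'street:\s*(?P<street>\S.*?)?\s*'
    r'city:\s*(?P<city>\S.*?)?\s*'
    r'floor:\s*(?P<floor>\S.*?)?\s*$'
)


def parse_address(address):
    """Return a dict of the three fields, or None if the labels are missing."""
    match = ADDRESS_PATTERN.search(address)
    if match is None:
        return None
    # groupdict() maps unmatched (blank) groups to None by default.
    return match.groupdict()


if __name__ == "__main__":
    print(parse_address("street: street A city: City B floor:"))
```

With the address from the question, this should give 'street A' and 'City B' without trailing whitespace and None for the floor. If the regex starts to feel too clever, stripping each value and converting empty strings to None after a simpler match gives the same result.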
