
Regex Expression to remove some whitespace with a look ahead and behind

I'm working through a dataframe in Python and cleaning up records. Some of them contain store numbers, slashes, and whitespace that I need to remove, leaving only a name and suburb.

An example of the text I'm working with is below:

Storename (Suburb / 1234     )
Storename (Surbub Suburb / 1234      )

I'm trying to get the regex to remove the spaces before the closing bracket, but only back as far as the last letter.

With the net result becoming:

Storename (Suburb)
Storename (Suburb)

I've been able to get the slash and numbers out with this:

test.LocationName.str.replace('[/0-9]','',regex=True)

But I can't work out the regex to remove the whitespace before the closing parenthesis.

You might use

test.LocationName.str.replace(r'\s*/\s*\d+\s*','',regex=True)

See a demo on regex101.com.
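
If it helps, here is a small sketch that runs the same pattern against the two sample strings with plain re, so it can be checked without a dataframe. The names PATTERN, samples, and cleaned are only for illustration and are not from the question. The pattern reads as: optional whitespace, a slash, optional whitespace, the digits, then any trailing whitespace.

```python
import re

# Same pattern as in the answer above, written as a raw string so the
# backslashes reach the regex engine unchanged.
PATTERN = re.compile(r"\s*/\s*\d+\s*")

# The two sample strings from the question.
samples = [
    "Storename (Suburb / 1234     )",
    "Storename (Surbub Suburb / 1234      )",
]

# Delete the slash, the store number, and the whitespace around them.
cleaned = [PATTERN.sub("", s) for s in samples]
print(cleaned)
# ['Storename (Suburb)', 'Storename (Surbub Suburb)']
```

Note that this keeps every word before the slash, so the second record comes out with both words inside the parentheses.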

Use re.sub:

import re
re.sub(r"\((\S+).+?\)", r"(\1)", "Storename (Suburb / 1234     )")
re.sub(r"\((\S+).+?\)", r"(\1)", "Storename (Surbub Suburb / 1234      )")

Output:

'Storename (Suburb)'
'Storename (Surbub)'
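
Because \S+ stops at the first space, the pattern above keeps only the first word inside the parentheses. If multi-word suburbs should be kept whole, one possible variant is to capture everything up to the slash instead. This is only a sketch under that assumption; PATTERN and keep_suburb are hypothetical names, not part of the original answer.

```python
import re

# Assumed variant: capture the text between "(" and the slash (lazily, so
# the whitespace before the slash is left out), then skip to the ")".
PATTERN = re.compile(r"\(([^/)]*?)\s*/[^)]*\)")

def keep_suburb(text):
    """Return text with '(name / number   )' reduced to '(name)'."""
    return PATTERN.sub(r"(\1)", text)

print(keep_suburb("Storename (Suburb / 1234     )"))
print(keep_suburb("Storename (Surbub Suburb / 1234      )"))
```

Strings whose parentheses contain no slash do not match and are returned unchanged.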

