I have a dataset that looks like this:
Column1
-------
abcd - efghi 1234
aasdas - asdas 54321
asda-asd 2344
aasdas(asd) 5234
I want to extract everything except the trailing number, so that the result looks like this:
Column2
-------
abcd - efghi
aasdas - asdas
asda-asd
aasdas(asd)
This is my current regex:
df['Column2'] = df['Column1'].str.extract('([A-Z]\w{0,})', expand=True)
But it only extracts the first word, and it stops at parentheses and hyphens. Any help would be appreciated. Thank you!
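You could use str.replace to remove the digits (in pandas 2.0 and later, pass regex=True, because the pattern is otherwise treated as a literal string):
df.Column1.str.replace(r'\d+', '', regex=True)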
Out[775]:
0 abcd-efghi
1 aasdas-asdas
2 asda-asd
3 aasdas(asd)
Name: Column1, dtype: object
# To assign the result back to the column:
# df.Column1 = df.Column1.str.replace(r'\d+', '', regex=True)
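If the whitespace in front of the number should go too, one option is to match it as part of the same pattern. The following is only a sketch: it rebuilds the sample data from the question and assumes the number always sits at the end of the string.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Column1": [
        "abcd - efghi 1234",
        "aasdas - asdas 54321",
        "asda-asd 2344",
        "aasdas(asd) 5234",
    ]
})

# \s* consumes any whitespace before the number, \d+ the number itself,
# and $ anchors the match to the end of the string.
df["Column2"] = df["Column1"].str.replace(r"\s*\d+$", "", regex=True)

print(df["Column2"].tolist())
```

Because the pattern is anchored with $, digits that appear earlier in a string would be left untouched.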
Just removing numbers will leave you with unwanted space characters.
This list comprehension drops every digit but keeps the inner spaces; strip() then removes the leading and trailing whitespace.
df['Column2'] = df['Column1'].apply(
    lambda x: ''.join([i for i in x if not i.isdigit()]).strip())
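To stay closer to the str.extract call in the question, a capture group can take everything that precedes the trailing number. This is a sketch under the same assumption as the question's sample data (the number, if present, is the last thing in the string); the pattern is not from the original post.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Column1": [
        "abcd - efghi 1234",
        "aasdas - asdas 54321",
        "asda-asd 2344",
        "aasdas(asd) 5234",
    ]
})

# (.*?) lazily captures the text; \s*\d*$ then absorbs optional whitespace
# and an optional run of digits at the end of the string.
# expand=False returns a Series because there is a single capture group.
df["Column2"] = df["Column1"].str.extract(r"^(.*?)\s*\d*$", expand=False)

print(df["Column2"].tolist())
```

Unlike the list comprehension, this leaves digits in the middle of a string in place and only drops the final number.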