Replacing multiple special characters in DataFrame column names

I have a DataFrame with 80+ columns, and the column names contain special characters such as colons, ampersands, slashes, minus signs, spaces, and parentheses.

Currently, I chain multiple replace calls on df.columns, as follows:

df.columns = df.columns.str.replace(' ','').str.replace('-','').str.replace('&','').str.replace('/','').str.replace(':','').str.replace('(','').str.replace(')','')

Is there a better way to replace all the special characters in the DataFrame column names at once?

You can use a regex:

df.columns = df.columns.str.replace(r'[ \-&/:()]', '', regex=True)
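To see what the character class does, here is a minimal sketch on a small DataFrame. The column names are made up for illustration and are not from the question:

```python
import pandas as pd

# Hypothetical column names covering each special character from the question
df = pd.DataFrame(columns=["Unit Price", "Profit-Loss", "R&D (USD)", "In/Out", "Time:Stamp"])

# One pass: remove every character listed inside the [...] character class
df.columns = df.columns.str.replace(r"[ \-&/:()]", "", regex=True)

print(list(df.columns))
# ['UnitPrice', 'ProfitLoss', 'RDUSD', 'InOut', 'TimeStamp']
```

The hyphen is escaped so that it is read as a literal character rather than as a range operator.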

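Some characters have a special meaning in regexes and need to be escaped (inside a character class, for example, an unescaped hyphen between two characters defines a range). To be safe, you can let re.escape do it for you: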

import re
regex = '['+re.escape(r' -&/:()')+']'
# '[\\ \\-\\&/:\\(\\)]'
df.columns = df.columns.str.replace(regex, '', regex=True)
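As a rough illustration of why the escaping matters, the sketch below compares an escaped and an unescaped character class using only the re module. The helper name build_strip_pattern and the sample names are hypothetical, not part of the question or of pandas:

```python
import re

def build_strip_pattern(chars):
    """Return a compiled character class matching any single character in `chars`."""
    return re.compile("[" + re.escape(chars) + "]")

# Hypothetical names; the last one contains characters we do NOT want to strip
names = ["Unit Price", "Profit-Loss", "R&D (USD)", "100% #1"]

safe = build_strip_pattern(" -&/:()")
cleaned = [safe.sub("", n) for n in names]

# Unescaped, " -&" inside [...] is a range from space to "&",
# which also covers ! " # $ % and no longer matches the hyphen itself
unsafe = re.compile("[ -&/:()]")
over_stripped = [unsafe.sub("", n) for n in names]

print(cleaned)        # ['UnitPrice', 'ProfitLoss', 'RDUSD', '100%#1']
print(over_stripped)  # ['UnitPrice', 'Profit-Loss', 'RDUSD', '1001']
```

The escaped pattern string (safe.pattern here) can be passed to df.columns.str.replace with regex=True in the same way as above.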
