
Python Pandas DataFrame User Defined Function Transformations

I have several DataFrames that I am in the process of cleaning. The following code works on its own (outside of a function), but I have to apply it to many DataFrames and want to wrap the process in a user-defined function. Can you please help me fix the following so that it can be used for all of my DataFrames?

def format_df(df):
    df.columns = df.columns.str
    df.dropna(thresh=1, axis='columns',inplace = True)
    df.dropna(thresh=80,axis=0,inplace = True)    
    df.columns = df.iloc[0]
    df = df.iloc[1:].reset_index(drop=True)
    df.columns = df.columns.str.replace(' ','',regex=False)
    df.columns = df.columns.str.replace('($)','',regex=False)
    df.columns = df.columns.str.replace('(Y/N)','Flag',regex=False)
    df.columns = df.columns.str.replace('(x)','',regex=False)
    df.columns = df.columns.str.replace('-','',regex=False)
    return df

The line df.columns = df.columns.str will not work: df.columns.str is the string-method accessor of the index, not an index of column labels, so it cannot be assigned back to df.columns. To convert the column labels to strings, use the astype method instead:

df.columns = df.columns.astype(str)
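Putting that fix into the function from the question, a reusable version might look like the sketch below. It is one possible arrangement, not the only one: the min_row_values parameter and the sample data are assumptions added for illustration (the question hard-codes 80), and the string conversion is applied to the promoted header row, since those are the labels the later replace calls operate on. The function avoids inplace=True and returns a new DataFrame, so the result has to be assigned by the caller.

```python
import pandas as pd


def format_df(df, min_row_values=80):
    """Return a cleaned copy of df; the caller's DataFrame is not modified."""
    # Drop columns with no values at all, then rows with too few values.
    df = df.dropna(thresh=1, axis="columns")
    df = df.dropna(thresh=min_row_values, axis="index")

    # Promote the first remaining row to the header, as strings.
    cols = pd.Index(df.iloc[0].astype(str).tolist())
    df = df.iloc[1:].reset_index(drop=True)

    # Same literal replacements as in the question, applied in the same order.
    replacements = [(" ", ""), ("($)", ""), ("(Y/N)", "Flag"), ("(x)", ""), ("-", "")]
    for old, new in replacements:
        cols = cols.str.replace(old, new, regex=False)

    df.columns = cols
    return df


# Hypothetical sample data: one empty junk row, one empty column, header in row 1.
raw = pd.DataFrame(
    [
        [None, None, None, None],
        ["Unit Price ($)", "Active (Y/N)", "Order-ID", None],
        [9.5, "Y", "A1", None],
        [3.0, "N", "B2", None],
    ]
)
cleaned = format_df(raw, min_row_values=3)
print(list(cleaned.columns))  # ['UnitPrice', 'ActiveFlag', 'OrderID']
```

To clean many DataFrames, the same function can be applied in a loop or comprehension, for example cleaned_frames = [format_df(d) for d in frames], where frames is whatever list of DataFrames you already have.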


 