
Apply function on each column in a pandas dataframe

How can I write the following function in a more idiomatic pandas way?

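    import numpy as np

    def calculate_df_columns_mean(self, df):
        means = {}
        # iterate over the column labels of the DataFrame
        for column in df.columns:
            # remove_outliers takes the column values as a plain list
            cleaned_data = self.remove_outliers(df[column].tolist())
            means[column] = np.mean(cleaned_data)
        return means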

Thanks for your help.

It seems to me that the iteration over the columns is unnecessary:

def calculate_df_columns_mean(self, df):
    # pass the whole DataFrame; no per-column loop is needed
    cleaned_data = self.remove_outliers(df)
    return cleaned_data.mean()

The above should be enough, assuming that remove_outliers accepts a DataFrame and still returns one; DataFrame.mean() then computes the mean of every column.
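
The question does not show remove_outliers, so the following is only a minimal sketch of the idea under an assumed rule: values more than n_std standard deviations from their column mean are treated as outliers and masked with NaN. The function body, the n_std parameter and the sample data are all hypothetical stand-ins, not the asker's implementation.

```python
import pandas as pd


def remove_outliers(df: pd.DataFrame, n_std: float = 2.0) -> pd.DataFrame:
    """Hypothetical rule (an assumption, not the asker's code): mask values
    lying more than n_std standard deviations from their column mean."""
    z = (df - df.mean()) / df.std()
    # where() keeps values where the condition holds and puts NaN elsewhere
    return df.where(z.abs() <= n_std)


def calculate_df_columns_mean(df: pd.DataFrame) -> pd.Series:
    # DataFrame.mean() skips NaN by default, so masked outliers are ignored
    return remove_outliers(df).mean()


# Made-up sample data: column "a" contains one obvious outlier
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 1000.0],
    "b": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0],
})
means = calculate_df_columns_mean(df)
```

The result is a Series indexed by column name; means.to_dict() gives the same kind of dictionary the original loop returned.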

EDIT

I think the following should work:

def calculate_df_columns_mean(self, df):
    # apply() passes each column to the lambda as a Series
    return df.apply(lambda x: np.mean(self.remove_outliers(x.tolist())))
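
For a version that can be run on its own, here is a sketch of the apply-based approach, assuming remove_outliers takes a list of values and returns the cleaned list, as in the question. The outlier rule, the n_std parameter and the sample data are hypothetical placeholders for whatever the real method does.

```python
import numpy as np
import pandas as pd


def remove_outliers(values, n_std=2.0):
    """Hypothetical list-based stand-in: drop values more than n_std
    (population) standard deviations from the mean."""
    arr = np.asarray(values, dtype=float)
    keep = np.abs(arr - arr.mean()) <= n_std * arr.std()
    return arr[keep].tolist()


def calculate_df_columns_mean(df):
    # apply() calls the lambda once per column (axis=0 is the default)
    # and collects the scalar results into a Series keyed by column name
    return df.apply(lambda col: np.mean(remove_outliers(col.tolist())))


# Made-up sample data: column "a" contains one obvious outlier
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 1000.0],
    "b": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0],
})
means = calculate_df_columns_mean(df).to_dict()
```

Calling to_dict() on the result reproduces the dictionary shape of the original loop-based function.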

Use DataFrame.apply(func, axis=0):

import numpy

# axis=0 applies func to each column; axis=1 applies it to each row
df.apply(numpy.sum, axis=0)  # equivalent to df.sum(axis=0)
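
As a small illustration of the axis argument, the sketch below applies numpy.sum along both axes of a made-up DataFrame, then applies a custom function per column. The sample data and the value_range helper are hypothetical and only serve the example.

```python
import numpy as np
import pandas as pd

# Made-up sample data
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0: func receives each column as a Series, giving one result per column
col_sums = df.apply(np.sum, axis=0)

# axis=1: func receives each row as a Series, giving one result per row
row_sums = df.apply(np.sum, axis=1)


def value_range(col):
    """Hypothetical custom reducer: largest value minus smallest value."""
    return col.max() - col.min()


# Any function that maps a Series to a scalar can be applied the same way
col_ranges = df.apply(value_range, axis=0)
```

For reductions that pandas already provides, such as sum or mean, calling the built-in method directly is usually simpler than going through apply.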
