Apply a function to each column of a pandas DataFrame

Question

How can I write the following function in a more idiomatic pandas way?

    def calculate_df_columns_mean(self, df):
        means = {}
        for column in df.columns.tolist():
            cleaned_data = self.remove_outliers(df[column].tolist())
            means[column] = np.mean(cleaned_data)
        return means

Thanks for the help.

Answer 1

It seems to me that the iteration over the columns is unnecessary:

def calculate_df_columns_mean(self, df):
    # Pass the whole DataFrame; no per-column loop is needed
    cleaned_data = self.remove_outliers(df)
    return cleaned_data.mean()

The above should be enough, assuming that remove_outliers accepts a DataFrame and still returns one.
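As a rough sketch of that assumption, here is one way remove_outliers could work on a whole DataFrame. The z-score rule, the `z_max` parameter, and the sample data are hypothetical stand-ins, since the question does not show how remove_outliers is implemented. The functions are written as plain functions rather than methods to keep the example self-contained.

```python
import pandas as pd


def remove_outliers(df, z_max=1.5):
    """Hypothetical stand-in: mask values lying more than z_max standard
    deviations from their column mean. Returns a DataFrame."""
    z_scores = (df - df.mean()) / df.std()
    # where() keeps values that meet the condition and sets the rest to NaN
    return df.where(z_scores.abs() <= z_max)


def calculate_df_columns_mean(df):
    # DataFrame.mean() reduces column by column and skips NaN by default
    return remove_outliers(df).mean()


df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 1000.0],
    "b": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
})
means = calculate_df_columns_mean(df)  # Series: a -> 3.0, b -> 12.5
```

Masking with NaN instead of dropping values keeps the frame rectangular, which is what lets a single `.mean()` call handle every column at once.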

EDIT

I think the following should work:

def calculate_df_columns_mean(self, df):
    # np.mean accepts the list (or array) returned by remove_outliers
    return df.apply(lambda x: np.mean(self.remove_outliers(x.tolist())))
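For a version that can be run end to end, here is a minimal sketch of the apply-based approach. It assumes remove_outliers takes a list and returns a list, as in the question; the Tukey-fence rule, the `k` parameter, and the sample data are hypothetical, not the asker's actual implementation.

```python
import numpy as np
import pandas as pd


def remove_outliers(values, k=1.5):
    """Hypothetical stand-in: keep values inside the Tukey fences
    [Q1 - k*IQR, Q3 + k*IQR]. Takes a list, returns a list."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def calculate_df_columns_mean(df):
    # apply() calls the lambda once per column (axis=0 is the default)
    # and collects the scalar results into a Series indexed by column name
    return df.apply(lambda col: np.mean(remove_outliers(col.tolist())))


df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 1000.0],
    "b": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
})
means = calculate_df_columns_mean(df)  # Series: a -> 3.0, b -> 12.5
means_dict = means.to_dict()           # dict, like the original function returned
```

The result is a Series rather than a dict; `to_dict()` recovers the original return type if callers depend on it.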

Answer 2

Use DataFrame.apply(func, axis=0):

import numpy

# axis=0 applies func to each column; axis=1 applies it to each row
df.apply(numpy.sum, axis=0)  # equivalent to df.sum(axis=0)
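To make the axis argument concrete, here is a small illustration on a made-up DataFrame (the column names and values are only for demonstration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0: func receives each column as a Series;
# the result is indexed by column name
column_sums = df.apply(np.sum, axis=0)  # a -> 6, b -> 60

# axis=1: func receives each row as a Series;
# the result is indexed by row label
row_sums = df.apply(np.sum, axis=1)     # 0 -> 11, 1 -> 22, 2 -> 33

# The built-in reductions return the same values without apply()
builtin_column_sums = df.sum(axis=0)
builtin_row_sums = df.sum(axis=1)
```

For simple reductions such as sum or mean, the built-in methods are usually the better choice; apply() is mainly useful when the per-column function is custom, as with remove_outliers in the question.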

2 answers

Answer 1: score 2, accepted, 2016-08-09 10:45:48

Answer 2: score 1, 2016-08-09 10:41:15
