How to apply a user-defined function column-wise on grouped data in pandas

Question

How can I apply a user-defined function column-wise on grouped data in pandas? The user-defined function returns a Series of fixed shape.

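import numpy as np
import pandas as pd

def getStats(col):
    # Return the mean and standard deviation of one column as a labelled Series
    names = ['mean', 'std']
    return pd.Series([np.mean(col), np.std(col)], index=names, name=col.name)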

df = pd.DataFrame({'city':['c1','c2','c1','c2'],
               'age':[10,20,30,40],
               'sal':[1000,2000,3000,4000]})

grp_data = df.groupby('city')
grp_data.apply(getStats)

I have tried the snippet above, but I am not getting the result in the expected format:

city | level | age | sal
c1   | mean  | x   | y
c2   | std   | x1  | y1

Could you please help with this?

Answer 1

I think a custom function is not necessary here. Instead, aggregate with GroupBy.agg and a list of aggregate functions, reshape with DataFrame.stack, and finally use DataFrame.rename_axis to set the city and level index labels:

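# String names select pandas' own groupby mean/std (sample std, ddof=1).
# The result gets a new name so the original df stays available below.
df1 = df.groupby('city').agg(['mean', 'std']).stack().rename_axis(['city', 'level'])
print(df1)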
                  age          sal
city level                        
c1   mean   20.000000  2000.000000
     std    14.142136  1414.213562
c2   mean   30.000000  3000.000000
     std    14.142136  1414.213562
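Note that the std values above are sample standard deviations (ddof=1), which is the default for pandas' groupby std, whereas np.std on a plain array defaults to the population form (ddof=0). If the population value is what is wanted, one possible sketch is to pass a small helper to agg. The helper name pop_std is an assumption made for illustration and is not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({'city': ['c1', 'c2', 'c1', 'c2'],
                   'age': [10, 20, 30, 40],
                   'sal': [1000, 2000, 3000, 4000]})

def pop_std(x):
    # Hypothetical helper: population standard deviation (ddof=0)
    return x.std(ddof=0)

# The function's __name__ ('pop_std') becomes the label in the 'level' index
out = (df.groupby('city')
         .agg(['mean', pop_std])
         .stack()
         .rename_axis(['city', 'level']))
print(out)
```

With this data the pop_std rows should come out as 10.0 for age and 1000.0 for sal, instead of 14.142136 and 1414.213562.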

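# Quantiles can be added by wrapping Series.quantile in a named function
def q(c):
    # The __name__ set here becomes the row label after stack()
    def f1(x):
        return x.quantile(c)
    f1.__name__ = f'q{c}'
    return f1

# Assign to a new name so the original df is not overwritten
df2 = (df.groupby('city')
         .agg(['mean', 'std', q(0.25), q(0.75)])
         .stack()
         .rename_axis(['city', 'level']))

print(df2)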
                  age          sal
city level                        
c1   mean   20.000000  2000.000000
     std    14.142136  1414.213562
     q0.25  15.000000  1500.000000
     q0.75  25.000000  2500.000000
c2   mean   30.000000  3000.000000
     std    14.142136  1414.213562
     q0.25  25.000000  2500.000000
     q0.75  35.000000  3500.000000
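If a custom function that returns a fixed-shape Series is still preferred, as in the question, one way that may work is to select the value columns first and then apply the function to each column inside each group. This is only a sketch under that assumption: get_stats is a hypothetical variant of the question's getStats, and the explicit column selection keeps the grouping column out of the computation.

```python
import pandas as pd

df = pd.DataFrame({'city': ['c1', 'c2', 'c1', 'c2'],
                   'age': [10, 20, 30, 40],
                   'sal': [1000, 2000, 3000, 4000]})

def get_stats(col):
    # Hypothetical variant of getStats: mean and population std (ddof=0) of one column
    values = col.to_numpy()
    return pd.Series([values.mean(), values.std()],
                     index=['mean', 'std'], name=col.name)

# Outer apply runs once per city; inner apply runs get_stats once per column
result = (df.groupby('city')[['age', 'sal']]
            .apply(lambda g: g.apply(get_stats))
            .rename_axis(['city', 'level']))
print(result)
```

The result has the same city/level layout as the agg-based version. The built-in aggregations are likely to be faster on large data, so this form is mainly useful when the statistic has no built-in equivalent.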

Answer 1: accepted, 2020-06-08 11:49:17