Aggregation over multiple columns in a pandas DataFrame

Question

Data:

import pandas as pd
z = pd.DataFrame({'a':[1,1,1,2,2,3,3],'b':[3,4,5,6,7,8,9], 'c':[10,11,12,13,14,15,16]})

My code:

gbz = z.groupby('a')
f1 = lambda x: x.loc[x['b'] > 4]['c'].mean()
f2 = lambda x: x.mean()
f3 = {"I don't know what I should write here": {'name1': f1}, 'b': {'name2': f2}}
list1 = gbz.agg(f3)

Question:

How can I use more than one column inside function "f1"? (This function needs two columns of the groupby object.)

Expected result:

     name1  name2
1    12.0   4
2    13.5   6.5
3    15.5   8.5

Answer 1

Nested dictionaries in the agg function are deprecated. What you can do instead is use groupby.apply and return a Series for each group; the labels in that Series' index become the column names of the result, which takes care of the renaming:

(z.groupby('a')
  .apply(lambda g: pd.Series({
    'name1': g.c[g.b > 4].mean(),
    'name2': g.b.mean()
})))

#  name1    name2
#a      
#1  12.0    4.0
#2  13.5    6.5
#3  15.5    8.5
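
The same approach can be written as a self-contained script. This is only a sketch: the helper name summarize is hypothetical, and selecting the columns ['b', 'c'] before calling apply is an added assumption so that the function receives only the data columns and not the grouping column.

```python
import pandas as pd

z = pd.DataFrame({
    'a': [1, 1, 1, 2, 2, 3, 3],
    'b': [3, 4, 5, 6, 7, 8, 9],
    'c': [10, 11, 12, 13, 14, 15, 16],
})

def summarize(g):
    # 'summarize' is a hypothetical helper name.
    # g is the sub-DataFrame of one group, holding columns 'b' and 'c'.
    return pd.Series({
        'name1': g.loc[g['b'] > 4, 'c'].mean(),  # mean of c where b > 4
        'name2': g['b'].mean(),                  # plain mean of b
    })

# One Series per group is stacked into a DataFrame indexed by 'a'.
result = z.groupby('a')[['b', 'c']].apply(summarize)
print(result)
```

Because the function sees the whole group, it can combine as many columns as it needs, which is what a per-column agg function cannot do.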

Answer 2

You can use agg with a lambda like this:

g = z.groupby('a').agg(lambda x: [x[(x.b > 4)].c.mean(), x.b.mean()])

You'll have to rename your columns manually:

g.columns = ['name1', 'name2']

print(g)
   name1  name2
a              
1   12.0    4.0
2   13.5    6.5
3   15.5    8.5
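
If manual renaming is undesirable, a different technique from the one above may work: mask column c first, then use named aggregation. This is a sketch that assumes pandas 0.25 or later (where named aggregation is available); the intermediate column name c_if_b_gt_4 is made up for illustration.

```python
import pandas as pd

z = pd.DataFrame({
    'a': [1, 1, 1, 2, 2, 3, 3],
    'b': [3, 4, 5, 6, 7, 8, 9],
    'c': [10, 11, 12, 13, 14, 15, 16],
})

# Keep c only where b > 4; other rows become NaN, which mean() skips.
# 'c_if_b_gt_4' is a hypothetical helper column name.
masked = z.assign(c_if_b_gt_4=z['c'].where(z['b'] > 4))

# Named aggregation: output_name=(input_column, aggregation).
result = masked.groupby('a').agg(
    name1=('c_if_b_gt_4', 'mean'),
    name2=('b', 'mean'),
)
print(result)
```

The condition that spans two columns is resolved before grouping, so each aggregation afterwards only needs a single column.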

2 answers

Answer 1: score 4, accepted, 2017-09-18 00:49:41

Answer 2: score 2, 2017-09-18 01:01:31
