In pandas groupby, apply a user-defined function to multiple columns and assign the result to a new column
I have the following data set:
> dt
a b group
1: 1 5 a
2: 2 6 a
3: 3 7 b
4: 4 8 b
I have the following function:
def bigSum(a, b):
    return a.min() + b.max()
I want to apply this function to columns a and b within each group (grouping by the group column) and assign the result to a new column c of the data frame. My desired result is:
> dt
a b group c
1: 1 5 a 7
2: 2 6 a 7
3: 3 7 b 11
4: 4 8 b 11
For instance, if I were using R data.table, I would do the following:
dt[, c := bigSum(a,b), by = group]
and it would work exactly as I expect. I would like to know whether there is something similar in pandas.
In pandas we have transform, which returns one value per original row, so the per-group results can be combined and assigned directly:
g = df.groupby('group')
df['out'] = g.a.transform('min') + g.b.transform('max')
df
Out[282]:
a b group out
1 1 5 a 7
2 2 6 a 7
3 3 7 b 11
4 4 8 b 11
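For anyone who wants to run the transform approach from scratch, here is a minimal self-contained sketch. The data frame is rebuilt from the question's table, and the result column is named c to match the desired output; both are assumptions made for illustration.

```python
import pandas as pd

# Rebuild the question's data (default 0-based index is used here).
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [5, 6, 7, 8],
    "group": ["a", "a", "b", "b"],
})

g = df.groupby("group")

# transform broadcasts each group's aggregate back to that group's rows,
# so the two Series align row by row with df and can simply be added.
df["c"] = g["a"].transform("min") + g["b"].transform("max")

print(df)
```

This only works because bigSum happens to decompose into one aggregate per column; a function that mixes the columns in a less separable way needs the whole sub-frame for each group.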
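Update: to call the user-defined function itself, apply it per group, then use reindex on the group column to repeat each group's result once per row: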
df['new'] = df.groupby('group').apply(lambda x : bigSum(x['a'],x['b'])).reindex(df.group).values
df
Out[287]:
a b group out new
1 1 5 a 7 7
2 2 6 a 7 7
3 3 7 b 11 11
4 4 8 b 11 11
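An alternative way to call bigSum itself, sketched here under the assumption that a plain loop over the groups is acceptable: compute one value per group, then map those values back onto the rows through the group column. The name per_group is hypothetical, and the data frame is rebuilt from the question's table.

```python
import pandas as pd

def bigSum(a, b):
    return a.min() + b.max()

df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [5, 6, 7, 8],
    "group": ["a", "a", "b", "b"],
})

# Iterating over a groupby yields (group key, sub-frame) pairs,
# so the function sees both columns of each group at once.
per_group = {key: bigSum(sub["a"], sub["b"]) for key, sub in df.groupby("group")}

# Map each row's group label to its group's result.
df["c"] = df["group"].map(per_group)

print(df)
```

Mapping through the group column keeps the assignment aligned by label rather than by position, so it does not depend on the row order of df.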