In pandas groupby mode, use a user-defined function, apply it to multiple columns, and assign the results to new pandas columns

I have the following data set:

> dt
   a b group
1: 1 5     a
2: 2 6     a
3: 3 7     b
4: 4 8     b

I have the following function:

def bigSum(a, b):
    # Smallest value of a plus largest value of b
    return a.min() + b.max()

I want to apply this function to columns a and b within each group (grouping by the group column) and assign the result to a new column c of the data frame. My desired result is:

> dt
   a b group  c
1: 1 5     a  7
2: 2 6     a  7
3: 3 7     b 11
4: 4 8     b 11

For instance, if I were using R's data.table, I would do the following:

dt[, c := bigSum(a,b), by = group]

and it would work exactly as I expect. I would like to know whether there is something similar in pandas.

In pandas we have transform:

g = df.groupby('group')
df['out'] = g.a.transform('min') + g.b.transform('max')
df
Out[282]: 
   a  b group  out
1  1  5     a    7
2  2  6     a    7
3  3  7     b   11
4  4  8     b   11
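
For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same transform approach. The DataFrame is rebuilt from the question's sample data, and the target column is named c to match the question rather than out; both choices are assumptions made for illustration.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed default integer index)
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [5, 6, 7, 8],
    "group": ["a", "a", "b", "b"],
})

g = df.groupby("group")

# transform returns one value per original row, so the two results
# line up with df and can be added and assigned directly
df["c"] = g["a"].transform("min") + g["b"].transform("max")

print(df)
```

Because transform keeps the original row index, no reindexing is needed before the assignment. The trade-off is that the function has to be expressed as per-column aggregations instead of being called as bigSum(a, b).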

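Update: to call the user-defined function itself on each group, use apply and broadcast the per-group result back to the rows: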

df['new'] = df.groupby('group').apply(lambda x: bigSum(x['a'], x['b'])).reindex(df.group).values
df
Out[287]: 
   a  b group  out  new
1  1  5     a    7    7
2  2  6     a    7    7
3  3  7     b   11   11
4  4  8     b   11   11
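
A possible variation on the apply approach, sketched here under assumptions: compute bigSum once per group by iterating over the groupby object, then broadcast the result with Series.map. The name per_group and the rebuilt sample data are illustrative and not part of the original answer.

```python
import pandas as pd


def bigSum(a, b):
    # Smallest value of a plus largest value of b
    return a.min() + b.max()


# Sample data reconstructed from the question
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [5, 6, 7, 8],
    "group": ["a", "a", "b", "b"],
})

# Iterating over a groupby yields (key, sub-DataFrame) pairs;
# per_group is a hypothetical helper name holding one result per key
per_group = {key: bigSum(sub["a"], sub["b"]) for key, sub in df.groupby("group")}

# map looks up each row's group key in the dict, broadcasting the group result
df["c"] = df["group"].map(per_group)

print(df)
```

This keeps the user-defined function unchanged and avoids the reindex(...).values step, at the cost of a Python-level loop over the groups.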
