Pandas: Apply custom function to groups and store result in new columns in each group
I am trying to apply a custom function to each group in a groupby object and store the result in new columns in each group itself. The function returns 2 values, and I want to store them separately in 2 columns in each group.
I have tried this:
# Returns False if the group has more than one row and all values in Column1 are identical; otherwise returns True.
def is_unique(x):
    status = True
    if len(x) > 1:
        a = x.to_numpy()
        if (a[0] == a).all():
            status = False
    return status

# Finds the first difference between consecutive column values and returns it with a message.
def func(x):
    d = x['Column3'].diff().dropna().iloc[0]
    return d, "Calculated!"

# is_unique() is another custom function used to filter unique groups.
df[['Difference', 'Message']] = df.filter(lambda x: is_unique(x['Column1'])).groupby(['Column2']).apply(lambda s: func(s))
But I am getting the error: 'DataFrameGroupBy' object does not support item assignment
I don't want to reset the index, and I want to view the result using the get_group function. The final dataframe should look like:
df.get_group('XYZ')
-----------------------------------------------------------------
| Column1 | Column2 | Column3 | Difference | Message |
-----------------------------------------------------------------
| 0 A | XYZ | 100 | | |
---------------------------------- | |
| 1 B | XYZ | 20 | 70 | Calculated! |
---------------------------------- | |
| 2 C | XYZ | 10 | | |
-----------------------------------------------------------------
What is the most efficient way to achieve this result?
I think you need:
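def func(x):
    # First non-NaN difference between consecutive Column3 values in the group.
    d = x['Column3'].diff().dropna().iloc[0]
    # Write both results into the last row of the group.
    last = x.index[-1]
    x.loc[last, 'Difference'] = d
    x.loc[last, 'Message'] = "Calculated!"
    return x

df1 = df.filter(lambda x: is_unique(x['Column1']))
df1 = df1.groupby(['Column2']).apply(func)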
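
For reference, here is a minimal end-to-end sketch of the same idea on made-up data. The sample frame, the PQR group, and the helper name add_difference are assumptions for illustration and not from the question. It first reproduces the error (a groupby object has no item assignment), then builds a regular DataFrame with the two new columns. It is a variant of the code above: it loops over the groups and uses pd.concat instead of apply, because how apply treats the grouping column differs between pandas versions.

```python
import pandas as pd

# Hypothetical sample data; the 'XYZ' rows mirror the table in the question.
data = pd.DataFrame({
    "Column1": ["A", "B", "C", "D", "D"],
    "Column2": ["XYZ", "XYZ", "XYZ", "PQR", "PQR"],
    "Column3": [100, 20, 10, 5, 7],
})
grouped = data.groupby("Column2")

# A GroupBy object has no item assignment, which reproduces the error.
try:
    grouped["Difference"] = 0
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

def is_unique(x):
    # False when a group has more than one row and every value is the same.
    return not (len(x) > 1 and (x.to_numpy() == x.iloc[0]).all())

def add_difference(group):
    # Return a copy with both new columns created up front.
    out = group.assign(Difference=float("nan"), Message="")
    last = out.index[-1]
    # First non-NaN difference between consecutive Column3 values.
    out.loc[last, "Difference"] = out["Column3"].diff().dropna().iloc[0]
    out.loc[last, "Message"] = "Calculated!"
    return out

# filter() on the GroupBy returns a plain DataFrame with the original index.
kept = grouped.filter(lambda g: is_unique(g["Column1"]))
pieces = [add_difference(g) for _, g in kept.groupby("Column2")]
result = pd.concat(pieces)

# Group again only to inspect a single group with get_group.
xyz = result.groupby("Column2").get_group("XYZ")
print(xyz)
```

The original index is preserved, so no reset is needed. A group with a single row has no difference to take and would need separate handling before calling iloc[0].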