How to pass multiple interrelated columns to a function with groupby and agg?

Question

I have the following pandas DataFrame df:

id  col1   col2
1   7      1.2
1   6      0.8
1   12     0.9
1   1      1.1
2   3      2.0
2   6      1.8
3   10     0.7
3   11     0.9
3   12     1.2

Here is the code to create this df:

import pandas as pd
df = pd.DataFrame({'id': [1,1,1,1,2,2,3,3,3], 
                   'col1': [7,6,12,1,3,6,10,11,12],
                   'col2': [1.2,0.8,0.9,1.1,2.0,1.8,0.7,0.9,1.2]})

I need to group by id and apply the function myfunc to each group. The problem is that myfunc requires several interrelated columns as input. The final goal is to create a new column new_col holding the result for each id.

How can I do it?

This is my current code:

def myfunc(df, col1, col2):

    df1 = col1
    df2 = df[df[col2] < 1][[col1]]
    var1 = df1.iloc[0]
    var2 = df2.iloc[0][0]

    result = var2 - var1

    return result


df["new_col"] = df.groupby("id").agg(myfunc(...??))

Answer 1

In groupby-apply, myfunc() is passed the entire group as a DataFrame, with all of its columns. You can simply select the columns you need from that group:

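def myfunc(g):
    # g is the sub-DataFrame holding the rows of one id
    var1 = g['col1'].iloc[0]
    # first col1 value among the rows of this group where col2 > 1
    var2 = g.loc[g['col2'] > 1, 'col1'].iloc[0]

    return var1 / var2

# apply returns one value per id (indexed by id), so map it back onto the rows
df['new_col'] = df['id'].map(df.groupby("id").apply(myfunc))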
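For comparison, here is a minimal sketch that keeps the logic from the question (first col1 value where col2 < 1, minus the group's first col1 value). The helper name first_below_minus_first and its parameters are hypothetical, and returning NaN for a group with no matching row is an assumption, since the question does not say what should happen in that case (id 2 has no col2 value below 1, so .iloc[0] would otherwise raise an IndexError).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 1, 2, 2, 3, 3, 3],
                   'col1': [7, 6, 12, 1, 3, 6, 10, 11, 12],
                   'col2': [1.2, 0.8, 0.9, 1.1, 2.0, 1.8, 0.7, 0.9, 1.2]})


def first_below_minus_first(g, value_col='col1', filter_col='col2', threshold=1.0):
    """Hypothetical helper: first value_col where filter_col < threshold,
    minus the first value_col of the group."""
    first_value = g[value_col].iloc[0]
    below = g.loc[g[filter_col] < threshold, value_col]
    if below.empty:
        # Assumption: a group without any matching row yields NaN
        return np.nan
    return below.iloc[0] - first_value


# Selecting the two columns explicitly means the function receives the same
# sub-DataFrame whether or not pandas includes the grouping column.
per_id = df.groupby('id')[['col1', 'col2']].apply(first_below_minus_first)

# per_id has one value per id; map it onto the rows to fill new_col
df['new_col'] = df['id'].map(per_id)
# per_id: id 1 -> -1.0, id 2 -> NaN, id 3 -> 0.0
```

Extra arguments can be forwarded through apply as keywords, for example .apply(first_below_minus_first, threshold=0.95), which avoids hard-coding column names or cutoffs inside the function.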

Answer 1 (accepted), 2019-07-18 09:41:01
