Groupby and apply a function to a pandas dataframe

Question

I have a pandas dataframe with the columns client_id, customer_id, overall_date, fetched_date, cal_value, and expo_value. Based on these columns, I have to apply a business formula to calculate an output_value column, grouping by client_id and customer_id.

I am unable to iterate over each row and get the output from the given dataframe.

Below is the function I have written for the dataframe, but it is not working.

def cal_df(df):
    df= df.groupby(['client_id','customer_id'].reset_index()
    for i in df.iterows():
# loop to iterate each row & calculate values
        df.iloc[i]= df.iloc['cal_value'][i]/30 * df.iloc['expo_value'][:-1][i] + df.iloc['cal_value'][i]/60 *df.iloc['expo_value'][:-2][i]
    return df

data = data.apply(lambda x:cal_df(df))

Formula: (df['cal_value'] / 30) * df['expo_value'][previous month; when calculating for August, the July value should be picked] + (df['cal_value'] / 60) * df['expo_value'][two months back; here the June value should be picked]

Example: grouping by client_id and customer_id, the following should be calculated:

For client_id 1: (45.9/30) * 777 + (45.9/30) * 289 = 1188.81 + 442.17 = 1630.98
For client_id 2: (36.0/30) * 663 + (36.0/30) * 181 = 795.6 + 217.2 = 1012.8

Input Dataframe

client_id  expo_value  overall_date  customer_id  fetched_date  cal_value
1          289         2022-06-01    1449         2022-08-01    45.9
1          777         2022-07-01    1449         2022-08-01    45.9
1          155         2022-08-01    1449         2022-08-01    45.9
2          181         2022-06-01    2700         2022-08-01    36.0
2          663         2022-07-01    2700         2022-08-01    36.0
2          136         2022-08-01    2700         2022-08-01    36.0

Output Dataframe

client_id  expo_value  overall_date  customer_id  fetched_date  cal_value  output_value
1          155         2022-08-01    1449         2022-08-01    45.9       1630.98
2          136         2022-08-01    2700         2022-08-01    36.0       1012.8

Answer 1

You can also apply a regular function to a groupby, so this might work:

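def get_result(df0):
    # Sort by date so the first two rows are the two earlier months (June, July).
    df0 = df0.sort_values('overall_date')
    output = (df0['cal_value'] * df0['expo_value'] / 30).iloc[:2].sum()
    # Keep only the latest month's row (August) and attach the result to it.
    return df0.tail(1).assign(output_value=output)

# Group on both keys named in the question; group_keys=False keeps the original columns.
result = df.groupby(['client_id', 'customer_id'], group_keys=False).apply(get_result).reset_index(drop=True)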
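As an alternative that avoids a per-group Python function, the "previous month" and "two months back" lookups can be sketched with a grouped shift. This is only a sketch under stated assumptions: each (client_id, customer_id) group has exactly one row per consecutive month with no gaps, and the row to keep is the one whose overall_date equals fetched_date. The function name add_output_value and its two divisor parameters are hypothetical. The question's formula divides the second term by 60 while its worked example divides by 30, so both divisors are parameters here, and the defaults follow the worked example.

```python
import pandas as pd

# Input data copied from the question.
df = pd.DataFrame({
    "client_id": [1, 1, 1, 2, 2, 2],
    "expo_value": [289, 777, 155, 181, 663, 136],
    "overall_date": pd.to_datetime(["2022-06-01", "2022-07-01", "2022-08-01"] * 2),
    "customer_id": [1449, 1449, 1449, 2700, 2700, 2700],
    "fetched_date": pd.to_datetime(["2022-08-01"] * 6),
    "cal_value": [45.9, 45.9, 45.9, 36.0, 36.0, 36.0],
})


def add_output_value(frame, prev_divisor=30, prev2_divisor=30):
    """Hypothetical helper: combine the previous two months' expo_value per group.

    Assumes one row per consecutive month in each (client_id, customer_id) group.
    """
    keys = ["client_id", "customer_id"]
    # Sort so that, within each group, rows run from oldest to newest month.
    out = frame.sort_values(keys + ["overall_date"]).copy()
    grouped = out.groupby(keys)["expo_value"]
    prev_month = grouped.shift(1)       # e.g. July value on the August row
    two_months_back = grouped.shift(2)  # e.g. June value on the August row
    out["output_value"] = (
        out["cal_value"] / prev_divisor * prev_month
        + out["cal_value"] / prev2_divisor * two_months_back
    )
    # Keep only the row for the month being calculated.
    return out[out["overall_date"] == out["fetched_date"]].reset_index(drop=True)


result = add_output_value(df)
print(result)
```

With both divisors set to 30, client 1 works out to 1.53 * 777 + 1.53 * 289 = 1188.81 + 442.17 = 1630.98, and client 2 to 1.2 * 663 + 1.2 * 181 = 1012.8. Calling add_output_value(df, prev2_divisor=60) applies the formula as literally written in the question instead. Rows without two earlier months get NaN from the shift, so a group with missing months would need extra handling.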

Answer 1 (2022-08-15 20:02:56)
