Groupby and apply a function
I would like to group my df by the variable "cod_id" and then apply this function:
[dd.loc[dd['dt_op'].between(d, d + pd.Timedelta(days = 7)), 'quantity'].sum() \
for d in data_1['dt_op']]
Moving from this df:
print(dd)
dt_op     quantity  cod_id
20/01/18  1         613
21/01/18  8         611
21/01/18  1         613
...
To this one:
print(final_dd)
n = 7
dt_op     quantity  product_code  Final_Quantity
20/01/18  1         613           2
21/01/18  8         611           8
25/01/18  1         613           1
...
I tried with:
dd.groupby(['cod_id']).apply([dd.loc[dd['dt_op'].between(d, d + pd.Timedelta(days = 7)), 'quantity'].sum() \
for d in data_1['dt_op']])
But it raises:
TypeError: unhashable type: 'list'
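The most likely cause is that the list comprehension is evaluated before `apply` is called, so `apply` receives a list of numbers rather than a function to call on each group. The sketch below shows the general shape of passing a callable instead. The sample frame and the helper name `total_quantity` are made up for illustration, and it assumes `dt_op` is in day/month/two-digit-year format.

```python
import pandas as pd

# Sample data mirroring the question (an assumption, not the real dataset).
dd = pd.DataFrame({
    "dt_op": pd.to_datetime(["20/01/18", "21/01/18", "21/01/18"], format="%d/%m/%y"),
    "quantity": [1, 8, 1],
    "cod_id": [613, 611, 613],
})

def total_quantity(quantities):
    # Called once per group; receives that group's 'quantity' values as a Series.
    return quantities.sum()

# apply takes the function object itself, not the result of calling it.
totals = dd.groupby("cod_id")["quantity"].apply(total_quantity)
print(totals)
```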
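This is a cumbersome but working solution:
def lookforward(x):
    # For each row, sum the quantity of the rows in the same group whose
    # dt_op falls within the following 7 days (both ends inclusive).
    L = [x.loc[x['dt_op'].between(row.dt_op, row.dt_op + pd.Timedelta(days=7)),
               'quantity'].sum() for row in x.itertuples(index=False)]
    return pd.Series(L, index=x.index)

s = df.groupby('cod_id').apply(lookforward)
# Drop the cod_id level so the result aligns with df's original index.
s.index = s.index.droplevel(0)
df['Final_Quantity'] = s
print(df)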
       dt_op  quantity  cod_id  Final_Quantity
0 2018-01-20         1     613               2
1 2018-01-21         8     611               8
2 2018-01-21         1     613               1