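Dask groupby agg weighted average "unknown aggregate lambda" error
In Dask, I need to compute a weighted average, weighted by a third column, over groups defined by the values of two columns. This is what I am doing: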
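import numpy as np
import dask.dataframe as dd
# df is an existing pandas DataFrame with columns A, B and C
dask_df = dd.from_pandas(df, npartitions=10)
# Weighted-average lambda passed directly to agg(); this is what Dask rejects
wm = lambda x: np.average(x, weights=dask_df.loc[x.index, "C"])
dask_df = dask_df.groupby(['A', 'B']).agg({'C': wm}).reset_index()
output_df = dask_df.compute()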
In pandas, I run out of memory. In Dask, I get:
File "<ipython-input-16-0beb32700c04>", line 3, in <module>
dask_df = dask_df.groupby(['A', 'B']).agg({'C' : wm}).reset_index()
File "/anaconda3/lib/python3.7/site-packages/dask/dataframe/groupby.py", line 1555, in agg
return self.aggregate(arg, split_every=split_every, split_out=split_out)
File "/anaconda3/lib/python3.7/site-packages/dask/dataframe/groupby.py", line 1550, in aggregate
arg, split_every=split_every, split_out=split_out
File "/anaconda3/lib/python3.7/site-packages/dask/dataframe/groupby.py", line 1355, in aggregate
chunk_funcs, aggregate_funcs, finalizers = _build_agg_args(spec)
File "/anaconda3/lib/python3.7/site-packages/dask/dataframe/groupby.py", line 659, in _build_agg_args
impls = _build_agg_args_single(result_column, func, input_column)
File "/anaconda3/lib/python3.7/site-packages/dask/dataframe/groupby.py", line 703, in _build_agg_args_single
raise ValueError("unknown aggregate {}".format(func))
ValueError: unknown aggregate lambda
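You may be interested in custom aggregations, which are documented here: https://docs.dask.org/en/latest/dataframe-groupby.html#aggregate
The error message could clearly be improved. I suggest raising an issue at https://github.com/dask/dask/issues/new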