
dask groupby agg weighted average "unknown aggregate lambda" error

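In Dask, I need to calculate a weighted average, weighted by a third column, over groups defined by the values of two columns. I'm doing this: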

dask_df = dd.from_pandas(df, npartitions = 10)
wm = lambda x: np.average(x, weights=dask_df.loc[x.index,"C"])
dask_df = dask_df.groupby(['A', 'B']).agg({'C' : wm}).reset_index()
output_df = dask_df.compute()

In pandas, I run out of memory. In Dask, I get:

  File "<ipython-input-16-0beb32700c04>", line 3, in <module>
    dask_df = dask_df.groupby(['A', 'B']).agg({'C' : wm}).reset_index()

  File "/anaconda3/lib/python3.7/site-packages/dask/dataframe/groupby.py", line 1555, in agg
    return self.aggregate(arg, split_every=split_every, split_out=split_out)

  File "/anaconda3/lib/python3.7/site-packages/dask/dataframe/groupby.py", line 1550, in aggregate
    arg, split_every=split_every, split_out=split_out

  File "/anaconda3/lib/python3.7/site-packages/dask/dataframe/groupby.py", line 1355, in aggregate
    chunk_funcs, aggregate_funcs, finalizers = _build_agg_args(spec)

  File "/anaconda3/lib/python3.7/site-packages/dask/dataframe/groupby.py", line 659, in _build_agg_args
    impls = _build_agg_args_single(result_column, func, input_column)

  File "/anaconda3/lib/python3.7/site-packages/dask/dataframe/groupby.py", line 703, in _build_agg_args_single
    raise ValueError("unknown aggregate {}".format(func))

ValueError: unknown aggregate lambda

You might be interested in custom aggregations, which are documented here: https://docs.dask.org/en/latest/dataframe-groupby.html#aggregate

Clearly, the error message could be improved. I suggest raising an issue at https://github.com/dask/dask/issues/new
