
Pandas SparseDtype not working with GroupBy

data.groupby(by="DAY").agg({"CLOSING_DATE": min})

Why do I get the following error when I try to group my dataframe to get the oldest date from a sparse column (CLOSING_DATE is mostly empty)?

Traceback (most recent call last):
File "<ipython-input-23-37f9fe161304>", line 1, in <module>
data[:10000].groupby(by="DAY").agg({"CLOSING_DATE": min})
File "/home/user/miniconda3/envs/churn/lib/python3.8/site-packages/pandas/core/groupby/generic.py", line 951, in aggregate
result, how = self._aggregate(func, *args, **kwargs)
File "/home/user/miniconda3/envs/py_env/lib/python3.8/site-packages/pandas/core/base.py", line 416, in _aggregate
result = _agg(arg, _agg_1dim)
File "/home/user/miniconda3/envs/py_env/lib/python3.8/site-packages/pandas/core/base.py", line 383, in _agg
result[fname] = func(fname, agg_how)
File "/home/user/miniconda3/envs/py_env/lib/python3.8/site-packages/pandas/core/base.py", line 367, in _agg_1dim
return colg.aggregate(how)
File "/home/user/miniconda3/envs/py_env/lib/python3.8/site-packages/pandas/core/groupby/generic.py", line 252, in aggregate
return getattr(self, cyfunc)()
File "/home/user/miniconda3/envs/py_env/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 1553, in min
return self._agg_general(
File "/home/user/miniconda3/envs/py_env/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 1000, in _agg_general
result = self._cython_agg_general(
File "/home/user/miniconda3/envs/py_env/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 1035, in _cython_agg_general
result, agg_names = self.grouper.aggregate(
File "/home/user/miniconda3/envs/py_env/lib/python3.8/site-packages/pandas/core/groupby/ops.py", line 591, in aggregate
return self._cython_operation(
File "/home/user/miniconda3/envs/py_env/lib/python3.8/site-packages/pandas/core/groupby/ops.py", line 471, in _cython_operation
raise NotImplementedError(f"{values.dtype} dtype not supported")
NotImplementedError: Sparse[float64, nan] dtype not supported

This is a bug in pandas, related to a recent refactoring of the cython-optimized groupbys: https://github.com/pandas-dev/pandas/issues/38980

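You have two options:

  1. Downgrade the pandas version you are using to 1.1.4 and wait for the bug to be fixed (probably about 4-6 weeks)
  2. Convert the sparse column to a dense one before the groupby with to_dense() (in pandas 1.x this is reached through the .sparse accessor, e.g. data["CLOSING_DATE"].sparse.to_dense())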
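
As a minimal sketch of the second option: the toy frame below is an assumption made up for illustration (the DAY values and the float CLOSING_DATE values are hypothetical, not the asker's data). It densifies only the sparse column through the `.sparse` accessor and then runs the same aggregation on the dense copy.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's frame: CLOSING_DATE is mostly NaN
# and stored as Sparse[float64, nan], the dtype named in the traceback.
data = pd.DataFrame(
    {
        "DAY": ["mon", "mon", "tue", "tue"],
        "CLOSING_DATE": pd.arrays.SparseArray([np.nan, 3.0, np.nan, np.nan]),
    }
)

# Densify just the sparse column; the original frame is left untouched.
dense = data.assign(CLOSING_DATE=data["CLOSING_DATE"].sparse.to_dense())

# Same aggregation as in the question, now on a plain float64 column.
oldest = dense.groupby(by="DAY").agg({"CLOSING_DATE": "min"})
print(oldest)
```

Densifying a single column keeps the memory cost limited to that column; if several sparse columns take part in the aggregation, each one would need the same conversion.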
