How to fix "TypeError: data type not understood" for a datetime object in Pandas

Question

I am working with a date column in pandas. I want to extract just the year and month into a separate column.

I did this as follows:

df1["month"] = pd.to_datetime(Table_A_df['date']).dt.to_period('M')

Printing it gives:

df1["month"]

Out:

0        2017-03
1        2017-03
2        2017-03
3        2017-03
4        2017-03
          ...   
79638    2018-03
79639    2018-03
79640    2018-03
79641    2018-03
79642    2018-03
Name: month, Length: 79643, dtype: period[M]
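For comparison, here is a minimal sketch of the two common ways to pull year and month out of a date column: as plain integer columns, or as a single `period[M]` column like the one above. The sample dates and the column names `year` and `month_num` are made up for illustration; they do not appear in the original data.

```python
import pandas as pd

# Hypothetical sample dates standing in for Table_A_df['date'].
dates = pd.to_datetime(pd.Series(["2017-03-15", "2018-03-01"]))

out = pd.DataFrame({
    "year": dates.dt.year,             # integer year
    "month_num": dates.dt.month,       # integer month (1-12)
    "month": dates.dt.to_period("M"),  # year-month as period[M]
})
print(out.dtypes)
```

The integer columns are ordinary numeric dtypes, while `month` is an extension dtype, which matters for the groupby step later in the question.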

My customer IDs look like this:

0        5094298f068196c5349d43847de5afc9125cf989
1                                             NaN
2                                             NaN
3        433fdf385e33176cf9b0d67ecf383aa928fa261c
4                                             NaN
                           ...                   
79638    6836d8cdd9c6c537c702b35ccd972fae58070004
79639    bbc08d8abad5e699823f2f0021762797941679be
79640    39b5fdd28cb956053d3e4f3f0b884fb95749da8a
79641    3342d5b210274b01e947cc15531ad53fbe25435b
79642    b3f02d0768c0ba8334047d106eb759f3e80517ac
Name: customer_id, Length: 79643, dtype: object

Now I am trying to group by customer ID and transform the data.

user_groups = df1.groupby("customer_id")["month"]

df1["Cohort_month"] = user_groups.transform("min")

I get the following error:

TypeError: data type not understood

Full traceback:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-108-107e17f9a489> in <module>
----> 1 df1["Cohort_month"] = user_groups.transform("min")

C:\Users\Public\Anaconda\lib\site-packages\pandas\core\groupby\generic.py in transform(self, func, *args, **kwargs)
    475         # result to the whole group. Compute func result
    476         # and deal with possible broadcasting below.
--> 477         result = getattr(self, func)(*args, **kwargs)
    478         return self._transform_fast(result, func)
    479 

C:\Users\Public\Anaconda\lib\site-packages\pandas\core\groupby\groupby.py in f(self, **kwargs)
   1375                 # try a cython aggregation if we can
   1376                 try:
-> 1377                     return self._cython_agg_general(alias, alt=npfunc, **kwargs)
   1378                 except DataError:
   1379                     pass

C:\Users\Public\Anaconda\lib\site-packages\pandas\core\groupby\groupby.py in _cython_agg_general(self, how, alt, numeric_only, min_count)
    887 
    888             result, agg_names = self.grouper.aggregate(
--> 889                 obj._values, how, min_count=min_count
    890             )
    891 

C:\Users\Public\Anaconda\lib\site-packages\pandas\core\groupby\ops.py in aggregate(self, values, how, axis, min_count)
    568     ) -> Tuple[np.ndarray, Optional[List[str]]]:
    569         return self._cython_operation(
--> 570             "aggregate", values, how, axis, min_count=min_count
    571         )
    572 

C:\Users\Public\Anaconda\lib\site-packages\pandas\core\groupby\ops.py in _cython_operation(self, kind, values, how, axis, min_count, **kwargs)
    560             result = type(orig_values)(result.astype(np.int64), dtype=orig_values.dtype)
    561         elif is_datetimelike and kind == "aggregate":
--> 562             result = result.astype(orig_values.dtype)
    563 
    564         return result, names

TypeError: data type not understood

This worked earlier, when the dates still had 1 as the day, but now that I keep only the year and month I get an error. Is there a way to fix this?
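The traceback ends at `result.astype(orig_values.dtype)`, and the question notes that the same call worked while the column still held full dates with 1 as the day. One possible workaround, sketched below under the assumption that the failure is specific to `period[M]` in the installed pandas version, is to take the group minimum on month-start timestamps and convert to a period only afterwards. The sample frame and the helper name `month_start` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; None stands in for a missing customer_id.
df = pd.DataFrame({
    "customer_id": ["a", None, "b", "a", "b"],
    "date": ["2017-03-15", "2017-04-02", "2017-05-20", "2018-03-01", "2017-06-09"],
})

# Month-start timestamps (day = 1), i.e. the form that reportedly worked.
month_start = pd.to_datetime(df["date"]).dt.to_period("M").dt.to_timestamp()

# Take the per-customer minimum on datetime64 values, then convert to period[M].
cohort_start = month_start.groupby(df["customer_id"]).transform("min")
df["Cohort_month"] = cohort_start.dt.to_period("M")
print(df)
```

Rows with a missing `customer_id` belong to no group, so their `Cohort_month` comes out as `NaT`.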

Answer 1

It works on the sample you shared, so I am not sure where the problem is. Are there any missing values in your month column?

df['month'] = pd.to_datetime(df['month']).dt.to_period('M')

user_groups = df.groupby("customer_id")["month"]
df["Cohort_month"] = user_groups.transform("min")
print(df)

                                customer_id    month Cohort_month
0  5094298f068196c5349d43847de5afc9125cf989  2017-03      2017-03
1                                       NaN  2017-03          NaT
2                                       NaN  2017-03          NaT
3  433fdf385e33176cf9b0d67ecf383aa928fa261c  2017-03      2017-03
4                                       NaN  2017-03          NaT
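To follow up on the missing-values question, a quick check along these lines may help narrow things down. It is only a sketch: the frame below rebuilds the five sample rows shown above, and the variable names `missing_months` and `missing_ids` are made up. Printing the pandas version is included because the same code ran without error for the answerer, so the installed version may be relevant.

```python
import numpy as np
import pandas as pd

# Rebuild the five sample rows from the output above.
df = pd.DataFrame({
    "customer_id": [
        "5094298f068196c5349d43847de5afc9125cf989",
        np.nan,
        np.nan,
        "433fdf385e33176cf9b0d67ecf383aa928fa261c",
        np.nan,
    ],
    "month": pd.PeriodIndex(["2017-03"] * 5, freq="M"),
})

print("pandas version:", pd.__version__)

# Count missing values in the value column and in the grouping key.
missing_months = int(df["month"].isna().sum())
missing_ids = int(df["customer_id"].isna().sum())
print("missing months:", missing_months)
print("missing customer ids:", missing_ids)
```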


Solution 1
0 2020-06-13 20:10:07
