How to fix TypeError: data type not understood with a datetime object in Pandas
I am working with a date column in pandas. I want only the year and month as a separate column.
I achieved that as follows:
df1["month"] = pd.to_datetime(Table_A_df['date']).dt.to_period('M')
Printed, it looks like this:
df1["month"]
Out:
0 2017-03
1 2017-03
2 2017-03
3 2017-03
4 2017-03
...
79638 2018-03
79639 2018-03
79640 2018-03
79641 2018-03
79642 2018-03
Name: month, Length: 79643, dtype: period[M]
My customer IDs look like this:
0 5094298f068196c5349d43847de5afc9125cf989
1 NaN
2 NaN
3 433fdf385e33176cf9b0d67ecf383aa928fa261c
4 NaN
...
79638 6836d8cdd9c6c537c702b35ccd972fae58070004
79639 bbc08d8abad5e699823f2f0021762797941679be
79640 39b5fdd28cb956053d3e4f3f0b884fb95749da8a
79641 3342d5b210274b01e947cc15531ad53fbe25435b
79642 b3f02d0768c0ba8334047d106eb759f3e80517ac
Name: customer_id, Length: 79643, dtype: object
Now I am trying to group by customer ID and transform the data:
user_groups = df1.groupby("customer_id")["month"]
df1["Cohort_month"] = user_groups.transform("min")
I get the following error:
TypeError: data type not understood
Full traceback:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-108-107e17f9a489> in <module>
----> 1 df1["Cohort_month"] = user_groups.transform("min")
C:\Users\Public\Anaconda\lib\site-packages\pandas\core\groupby\generic.py in transform(self, func, *args, **kwargs)
475 # result to the whole group. Compute func result
476 # and deal with possible broadcasting below.
--> 477 result = getattr(self, func)(*args, **kwargs)
478 return self._transform_fast(result, func)
479
C:\Users\Public\Anaconda\lib\site-packages\pandas\core\groupby\groupby.py in f(self, **kwargs)
1375 # try a cython aggregation if we can
1376 try:
-> 1377 return self._cython_agg_general(alias, alt=npfunc, **kwargs)
1378 except DataError:
1379 pass
C:\Users\Public\Anaconda\lib\site-packages\pandas\core\groupby\groupby.py in _cython_agg_general(self, how, alt, numeric_only, min_count)
887
888 result, agg_names = self.grouper.aggregate(
--> 889 obj._values, how, min_count=min_count
890 )
891
C:\Users\Public\Anaconda\lib\site-packages\pandas\core\groupby\ops.py in aggregate(self, values, how, axis, min_count)
568 ) -> Tuple[np.ndarray, Optional[List[str]]]:
569 return self._cython_operation(
--> 570 "aggregate", values, how, axis, min_count=min_count
571 )
572
C:\Users\Public\Anaconda\lib\site-packages\pandas\core\groupby\ops.py in _cython_operation(self, kind, values, how, axis, min_count, **kwargs)
560 result = type(orig_values)(result.astype(np.int64), dtype=orig_values.dtype)
561 elif is_datetimelike and kind == "aggregate":
--> 562 result = result.astype(orig_values.dtype)
563
564 return result, names
TypeError: data type not understood
This worked earlier, when the dates still had 1 as the day, but now that the column holds only year and month I get an error. Is there a way to fix this?
It works for the sample you shared, so I am not sure where the problem is. Are there any missing values in your month column?
df['month'] = pd.to_datetime(df['month']).dt.to_period('M')
user_groups = df.groupby("customer_id")["month"]
df["Cohort_month"] = user_groups.transform("min")
print(df)
customer_id month Cohort_month
0 5094298f068196c5349d43847de5afc9125cf989 2017-03 2017-03
1 NaN 2017-03 NaT
2 NaN 2017-03 NaT
3 433fdf385e33176cf9b0d67ecf383aa928fa261c 2017-03 2017-03
4 NaN 2017-03 NaT
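If the error persists on your pandas version, one possible workaround (a sketch, not part of the answer above) is to take the minimum on plain timestamps and convert back to periods afterwards. The traceback fails at the cast back to the original dtype, so this avoids aggregating the period[M] column directly. The small frame below is made-up sample data; the column names follow the question.

```python
import pandas as pd

# Hypothetical sample data, shaped like the frame in the question
df = pd.DataFrame({
    "customer_id": ["a", None, "a", "b"],
    "date": ["2017-03-05", "2017-03-07", "2017-04-01", "2018-03-09"],
})
df["month"] = pd.to_datetime(df["date"]).dt.to_period("M")

# Check for missing months first, as the answer suggests
print(df["month"].isna().sum())

# Workaround: aggregate on datetime64 values instead of period[M]
month_start = df["month"].dt.to_timestamp()
cohort_start = month_start.groupby(df["customer_id"]).transform("min")

# Convert the result back to a monthly period
df["Cohort_month"] = cohort_start.dt.to_period("M")
print(df)
```

Rows with a missing customer_id are dropped from the groups, so they end up as NaT in Cohort_month, just as in the output above.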