
How to fix TypeError: data type not understood with a datetime object in Pandas

I am working with a date column in pandas. I want just the year and month as a separate column.

I achieved that with:

df1["month"] = pd.to_datetime(Table_A_df['date']).dt.to_period('M')

Printing it looks like this:

df1["month"]

Out:

0        2017-03
1        2017-03
2        2017-03
3        2017-03
4        2017-03
          ...   
79638    2018-03
79639    2018-03
79640    2018-03
79641    2018-03
79642    2018-03
Name:   month, Length: 79643, dtype: period[M]

My customer id column looks like this:

0        5094298f068196c5349d43847de5afc9125cf989
1                                             NaN
2                                             NaN
3        433fdf385e33176cf9b0d67ecf383aa928fa261c
4                                             NaN
                           ...                   
79638    6836d8cdd9c6c537c702b35ccd972fae58070004
79639    bbc08d8abad5e699823f2f0021762797941679be
79640    39b5fdd28cb956053d3e4f3f0b884fb95749da8a
79641    3342d5b210274b01e947cc15531ad53fbe25435b
79642    b3f02d0768c0ba8334047d106eb759f3e80517ac
Name: customer_id, Length: 79643, dtype: object

Now I am trying to group by customer id and transform the data:

user_groups = df1.groupby("customer_id")["month"]

df1["Cohort_month"] = user_groups.transform("min")

I get the following error:

TypeError: data type not understood

Full traceback:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-108-107e17f9a489> in <module>
----> 1 df1["Cohort_month"] = user_groups.transform("min")

C:\Users\Public\Anaconda\lib\site-packages\pandas\core\groupby\generic.py in transform(self, func, *args, **kwargs)
    475         # result to the whole group. Compute func result
    476         # and deal with possible broadcasting below.
--> 477         result = getattr(self, func)(*args, **kwargs)
    478         return self._transform_fast(result, func)
    479 

C:\Users\Public\Anaconda\lib\site-packages\pandas\core\groupby\groupby.py in f(self, **kwargs)
   1375                 # try a cython aggregation if we can
   1376                 try:
-> 1377                     return self._cython_agg_general(alias, alt=npfunc, **kwargs)
   1378                 except DataError:
   1379                     pass

C:\Users\Public\Anaconda\lib\site-packages\pandas\core\groupby\groupby.py in _cython_agg_general(self, how, alt, numeric_only, min_count)
    887 
    888             result, agg_names = self.grouper.aggregate(
--> 889                 obj._values, how, min_count=min_count
    890             )
    891 

C:\Users\Public\Anaconda\lib\site-packages\pandas\core\groupby\ops.py in aggregate(self, values, how, axis, min_count)
    568     ) -> Tuple[np.ndarray, Optional[List[str]]]:
    569         return self._cython_operation(
--> 570             "aggregate", values, how, axis, min_count=min_count
    571         )
    572 

C:\Users\Public\Anaconda\lib\site-packages\pandas\core\groupby\ops.py in _cython_operation(self, kind, values, how, axis, min_count, **kwargs)
    560             result = type(orig_values)(result.astype(np.int64), dtype=orig_values.dtype)
    561         elif is_datetimelike and kind == "aggregate":
--> 562             result = result.astype(orig_values.dtype)
    563 
    564         return result, names

TypeError: data type not understood

This worked before, when I kept 1 as the day, but since I changed the column to hold only the year and month I get this error. Is there a fix for this?
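Since the day-level version worked, one possible workaround is to take the per-customer minimum on the original datetime values and convert to a monthly period only afterwards. The sketch below is a minimal illustration under that assumption. The sample frame, its values, and the name `cohort_ts` are made up for the example and are not from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the real table.
df = pd.DataFrame({
    "customer_id": ["a", None, "b", "a", "b"],
    "date": ["2017-03-15", "2017-03-20", "2017-05-02", "2017-04-01", "2017-04-11"],
})

# Parse once and keep the full timestamps (datetime64) for the aggregation.
dates = pd.to_datetime(df["date"])
df["month"] = dates.dt.to_period("M")

# Take the earliest timestamp per customer on the datetime64 values,
# then convert the result to a monthly period.
cohort_ts = dates.groupby(df["customer_id"]).transform("min")
df["Cohort_month"] = cohort_ts.dt.to_period("M")

print(df[["customer_id", "month", "Cohort_month"]])
```

The minimum of the timestamps falls in the same month as the minimum of the monthly periods, so the resulting cohort month should match what the period-based transform was meant to produce.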

It works for the sample you shared, so I am not sure where the issue is. Are there any missing values in your month column?

df['month'] = pd.to_datetime(df['month']).dt.to_period('M')

user_groups = df.groupby("customer_id")["month"]
df["Cohort_month"] = user_groups.transform("min")
print(df)

                                customer_id    month Cohort_month
0  5094298f068196c5349d43847de5afc9125cf989  2017-03      2017-03
1                                       NaN  2017-03          NaT
2                                       NaN  2017-03          NaT
3  433fdf385e33176cf9b0d67ecf383aa928fa261c  2017-03      2017-03
4                                       NaN  2017-03          NaT
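To follow up on the question about missing values, a quick check along these lines may help narrow things down. It is only a sketch: the small frame is invented for illustration, and printing the pandas version is included on the assumption that the behaviour could differ between installations, since the error did not reproduce here.

```python
import pandas as pd

# Hypothetical frame with one missing date, for illustration only.
df = pd.DataFrame({
    "customer_id": ["a", "b", "c"],
    "date": ["2017-03-01", None, "2018-03-09"],
})

month = pd.to_datetime(df["date"]).dt.to_period("M")

# Report the environment and the state of the column before grouping.
print("pandas version:", pd.__version__)
print("dtype:", month.dtype)

# Missing dates become NaT in the period column; count them.
n_missing = int(month.isna().sum())
print("missing months:", n_missing)
```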
