
How to convert dataframe column into UTC datetime format?

I want to convert the Origin column of the dataframe data_copy to UTC datetime format.

import pandas as pd

>>> data_copy["Origin"]
 
0       1669-06-04 00:00:00
1       1669-06-22 00:00:00
2       1720-07-15 00:00:00
3       1803-09-01 00:00:00
4       1816-05-26 00:00:00
        
6395    2020-03-29 18:27:36
6396    2020-03-29 18:47:53
6397    2020-03-29 20:05:19
6398    2020-03-30 02:19:27
6399    2020-03-30 06:11:36

There are also some data entries with a time of 00:00:00 (I need to convert these too). I tried this command:

data_copy["Origin"] = pd.to_datetime(data_copy["Origin"],infer_datetime_format=True)

But I am getting an error like this:

Traceback (most recent call last):

  File "C:\ProgramData\Anaconda3\envs\roses\lib\site-packages\pandas\core\arrays\datetimes.py", line 2054, in objects_to_datetime64ns
    values, tz_parsed = conversion.datetime_to_datetime64(data)

  File "pandas\_libs\tslibs\conversion.pyx", line 350, in pandas._libs.tslibs.conversion.datetime_to_datetime64

TypeError: Unrecognized value type: <class 'str'>


During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "<ipython-input-93-aead2d23f264>", line 1, in <module>
    data_copy["Origin"] = pd.to_datetime(data_copy["Origin"],infer_datetime_format=True)

  File "C:\ProgramData\Anaconda3\envs\roses\lib\site-packages\pandas\core\tools\datetimes.py", line 803, in to_datetime
    values = convert_listlike(arg._values, format)

  File "C:\ProgramData\Anaconda3\envs\roses\lib\site-packages\pandas\core\tools\datetimes.py", line 466, in _convert_listlike_datetimes
    allow_object=True,

  File "C:\ProgramData\Anaconda3\envs\roses\lib\site-packages\pandas\core\arrays\datetimes.py", line 2059, in objects_to_datetime64ns
    raise e

  File "C:\ProgramData\Anaconda3\envs\roses\lib\site-packages\pandas\core\arrays\datetimes.py", line 2050, in objects_to_datetime64ns
    require_iso8601=require_iso8601,

  File "pandas\_libs\tslib.pyx", line 352, in pandas._libs.tslib.array_to_datetime

  File "pandas\_libs\tslib.pyx", line 574, in pandas._libs.tslib.array_to_datetime

  File "pandas\_libs\tslib.pyx", line 570, in pandas._libs.tslib.array_to_datetime

  File "pandas\_libs\tslib.pyx", line 546, in pandas._libs.tslib.array_to_datetime

  File "pandas\_libs\tslibs\np_datetime.pyx", line 113, in pandas._libs.tslibs.np_datetime.check_dts_bounds

OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1669-06-04 00:00:00

How can I convert the column into UTC datetime format?

The problem is that some of the datetimes fall outside the limits pandas supports for nanosecond timestamps:

In [92]: pd.Timestamp.min
Out[92]: Timestamp('1677-09-21 00:12:43.145225')

In [93]: pd.Timestamp.max
Out[93]: Timestamp('2262-04-11 23:47:16.854775807')
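To see which rows are affected, the strings can be parsed with Python's own datetime class and compared against these bounds. This is only a minimal sketch: the three sample values are taken from the question, the names lower, upper and out_of_bounds are illustrative, and the bounds are truncated to microsecond precision so that they can be compared with plain Python datetimes.

```python
from datetime import datetime

import pandas as pd

# Sample values from the question (assumed to be ISO-formatted strings).
s = pd.Series(["1669-06-04 00:00:00", "1816-05-26 00:00:00", "2020-03-29 18:27:36"])

# Bounds of the nanosecond Timestamp as plain datetimes.
# warn=False silences the warning about the discarded nanoseconds.
lower = pd.Timestamp.min.to_pydatetime(warn=False)
upper = pd.Timestamp.max.to_pydatetime(warn=False)

# Parse with the standard library, which has no such year limits.
parsed = [datetime.fromisoformat(t) for t in s]

# Boolean mask: True where the value cannot be stored as a nanosecond Timestamp.
out_of_bounds = pd.Series([not (lower <= d <= upper) for d in parsed], index=s.index)

print(s[out_of_bounds])
```

With this sample, only the 1669 entry is flagged.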

A possible solution is to replace the out-of-range values with NaT by passing the errors='coerce' parameter:

data_copy["Origin"] = pd.to_datetime(data_copy["Origin"],
                                     infer_datetime_format=True, 
                                     errors='coerce')
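Since the question asks for UTC specifically, the same call can also take utc=True, which returns a timezone-aware column. The following is a hedged sketch on three sample values from the question, not the original answer's code. It leaves out infer_datetime_format, which recent pandas releases deprecate. Which values end up as NaT depends on the timestamp resolution of the pandas version in use.

```python
import pandas as pd

# Sample values from the question; the first one predates pd.Timestamp.min.
s = pd.Series(["1669-06-04 00:00:00", "1816-05-26 00:00:00", "2020-03-29 18:27:36"])

# errors="coerce" turns values that cannot be represented into NaT,
# utc=True makes the result timezone-aware (UTC).
converted = pd.to_datetime(s, errors="coerce", utc=True)

print(converted.dtype)          # a datetime64 dtype with UTC timezone
print(converted.isna().sum())   # number of values that were coerced to NaT
```

Because the result is a real datetime column, the dt accessor works on it, for example converted.dt.year.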

If you need the datetimes anyway, you can use Python's datetime class. However, this will leave you with a column of dtype object, meaning that pandas' datetime functionality (the dt accessor) is not available. Example:

from datetime import datetime, timezone
import pandas as pd

s = (pd.Series(["1669-06-04 00:00:00", "1816-05-26 00:00:00", "2020-03-29 18:27:36"])
                 .apply(lambda t: datetime.fromisoformat(t).replace(tzinfo=timezone.utc)))

# s
# 0    1669-06-04 00:00:00+00:00
# 1    1816-05-26 00:00:00+00:00
# 2    2020-03-29 18:27:36+00:00
# dtype: object

You can still access the datetime class's methods, but this then requires iteration (apply).
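As a small illustration of that last point, here is a sketch that pulls the year and an ISO string out of such an object column with apply. The series is rebuilt here so the snippet runs on its own; dtype=object is passed explicitly as an assumption, to keep pandas from converting the values, and the names years and iso are illustrative.

```python
from datetime import datetime, timezone

import pandas as pd

raw = ["1669-06-04 00:00:00", "1816-05-26 00:00:00", "2020-03-29 18:27:36"]

# Object column of timezone-aware Python datetimes (UTC).
s = pd.Series(
    [datetime.fromisoformat(t).replace(tzinfo=timezone.utc) for t in raw],
    dtype=object,
)

# s.dt.year is not available on an object column, so go element by element.
years = s.apply(lambda d: d.year)
iso = s.apply(lambda d: d.isoformat())

print(years.tolist())  # [1669, 1816, 2020]
print(iso.iloc[2])     # 2020-03-29T18:27:36+00:00
```

This is slower than the vectorized dt accessor on large columns, but it keeps the dates before 1677 intact.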


 