
How to convert dataframe column into UTC datetime format?

I want to convert the Origin column of the dataframe data_copy into UTC datetime format.

import pandas as pd

>>>data_copy["Origin"]
 
0       1669-06-04 00:00:00
1       1669-06-22 00:00:00
2       1720-07-15 00:00:00
3       1803-09-01 00:00:00
4       1816-05-26 00:00:00
        
6395    2020-03-29 18:27:36
6396    2020-03-29 18:47:53
6397    2020-03-29 20:05:19
6398    2020-03-30 02:19:27
6399    2020-03-30 06:11:36

Some entries have a time of 00:00:00, and I need to convert those as well. I tried the command data_copy["Origin"] = pd.to_datetime(data_copy["Origin"],infer_datetime_format=True) but I get an error like this:

Traceback (most recent call last):

  File "C:\ProgramData\Anaconda3\envs\roses\lib\site-packages\pandas\core\arrays\datetimes.py", line 2054, in objects_to_datetime64ns
    values, tz_parsed = conversion.datetime_to_datetime64(data)

  File "pandas\_libs\tslibs\conversion.pyx", line 350, in pandas._libs.tslibs.conversion.datetime_to_datetime64

TypeError: Unrecognized value type: <class 'str'>


During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "<ipython-input-93-aead2d23f264>", line 1, in <module>
    data_copy["Origin"] = pd.to_datetime(data_copy["Origin"],infer_datetime_format=True)

  File "C:\ProgramData\Anaconda3\envs\roses\lib\site-packages\pandas\core\tools\datetimes.py", line 803, in to_datetime
    values = convert_listlike(arg._values, format)

  File "C:\ProgramData\Anaconda3\envs\roses\lib\site-packages\pandas\core\tools\datetimes.py", line 466, in _convert_listlike_datetimes
    allow_object=True,

  File "C:\ProgramData\Anaconda3\envs\roses\lib\site-packages\pandas\core\arrays\datetimes.py", line 2059, in objects_to_datetime64ns
    raise e

  File "C:\ProgramData\Anaconda3\envs\roses\lib\site-packages\pandas\core\arrays\datetimes.py", line 2050, in objects_to_datetime64ns
    require_iso8601=require_iso8601,

  File "pandas\_libs\tslib.pyx", line 352, in pandas._libs.tslib.array_to_datetime

  File "pandas\_libs\tslib.pyx", line 574, in pandas._libs.tslib.array_to_datetime

  File "pandas\_libs\tslib.pyx", line 570, in pandas._libs.tslib.array_to_datetime

  File "pandas\_libs\tslib.pyx", line 546, in pandas._libs.tslib.array_to_datetime

  File "pandas\_libs\tslibs\np_datetime.pyx", line 113, in pandas._libs.tslibs.np_datetime.check_dts_bounds

OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1669-06-04 00:00:00

How can I convert the column into UTC datetime format?

The problem is that some of the datetimes fall outside the limits pandas supports for nanosecond timestamps:

In [92]: pd.Timestamp.min
Out[92]: Timestamp('1677-09-21 00:12:43.145225')

In [93]: pd.Timestamp.max
Out[93]: Timestamp('2262-04-11 23:47:16.854775807')
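To see which entries fall outside that range before converting, a minimal sketch along these lines may help. It uses three sample values from the question; the names `origin`, `lower`, `upper` and `out_of_bounds` are illustrative and not part of the original post.

```python
from datetime import datetime

import pandas as pd

# Sample values taken from the question; `origin` is an illustrative name.
origin = ["1669-06-04 00:00:00", "1816-05-26 00:00:00", "2020-03-29 18:27:36"]

# Bounds of a nanosecond-resolution Timestamp as plain Python datetimes.
# warn=False silences the warning about dropped nanoseconds, so the bounds
# are only accurate to the microsecond.
lower = pd.Timestamp.min.to_pydatetime(warn=False)
upper = pd.Timestamp.max.to_pydatetime(warn=False)

# True where the value cannot be represented as a nanosecond timestamp.
out_of_bounds = [not (lower <= datetime.fromisoformat(v) <= upper) for v in origin]
print(out_of_bounds)  # [True, False, False]
```

Only the 1669 entry is flagged, which matches the value named in the OutOfBoundsDatetime message.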

A possible solution is to replace the out-of-bounds values with NaT via the errors='coerce' parameter:

data_copy["Origin"] = pd.to_datetime(data_copy["Origin"],
                                     infer_datetime_format=True, 
                                     errors='coerce')
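Since the question asks for UTC, the same call can also be given utc=True so that the result is timezone-aware. The sketch below assumes the strings are already UTC wall-clock times, which the original post does not state; the small data_copy frame is built here only for illustration. Whether the 1669 entry becomes NaT depends on the resolution pandas parses at: at nanosecond resolution, as in the traceback above, it does.

```python
import pandas as pd

# Illustrative stand-in for the question's dataframe.
data_copy = pd.DataFrame(
    {"Origin": ["1669-06-04 00:00:00", "1816-05-26 00:00:00", "2020-03-29 18:27:36"]}
)

# errors="coerce" turns unparseable or out-of-bounds values into NaT
# instead of raising; utc=True makes the resulting column UTC-aware.
data_copy["Origin"] = pd.to_datetime(data_copy["Origin"], errors="coerce", utc=True)

print(data_copy["Origin"].dt.tz)         # timezone attached to the column
print(data_copy["Origin"].isna().sum())  # number of entries coerced to NaT
```

The dt accessor remains available on the converted column, at the cost of losing any dates that were coerced to NaT.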

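If you still need the datetimes, you can use Python's datetime class. However, this leaves you with a column of dtype object, which means pandas' datetime functionality (the dt accessor) is not available. Ex: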

from datetime import datetime, timezone
import pandas as pd

s = (pd.Series(["1669-06-04 00:00:00", "1816-05-26 00:00:00", "2020-03-29 18:27:36"])
                 .apply(lambda t: datetime.fromisoformat(t).replace(tzinfo=timezone.utc)))

# s
# 0    1669-06-04 00:00:00+00:00
# 1    1816-05-26 00:00:00+00:00
# 2    2020-03-29 18:27:36+00:00
# dtype: object

You still have access to the methods of the datetime class, but that requires iteration (apply).
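As a rough illustration of that last point, the sketch below rebuilds the series s from the example above and reads the year and the ISO string of each element through apply. The names years and iso are illustrative only.

```python
from datetime import datetime, timezone

import pandas as pd

# Same construction as in the answer: timezone-aware Python datetimes.
s = (pd.Series(["1669-06-04 00:00:00", "1816-05-26 00:00:00", "2020-03-29 18:27:36"])
       .apply(lambda t: datetime.fromisoformat(t).replace(tzinfo=timezone.utc)))

# Without the dt accessor, attributes and methods are reached per element.
years = s.apply(lambda d: d.year)
iso = s.apply(lambda d: d.isoformat())

print(years.tolist())  # [1669, 1816, 2020]
print(iso.iloc[2])     # 2020-03-29T18:27:36+00:00
```

Each apply call loops over the elements in Python, so on a large column this is noticeably slower than the vectorized dt accessor.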

