
Pandas to_datetime ignores the format

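I am trying to convert the dates stored in my DataFrame to datetime format. The dates in the column I want to convert are stored in mm/dd/yy format.

This is the script I use for the conversion: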

df['dt'] = pd.to_datetime(df['dt'], format = '%d-%m-%Y')

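Even though the format I provide is incorrect, the script converts the dates accurately and raises no error.

My question is: why doesn't the script throw an error when the wrong format is provided?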

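Consider the date 1-2-2020. Looking at the date alone, can you say exactly which day it is? The answer is no: unless you know the format of the date or how it was created (day-month-year or month-day-year), you cannot really tell whether it is 1st February 2020 or 2nd January 2020.

So the key here is to validate the dataset and its source. You can apply several heuristics to your data: for example, if the data comes from the US, the common date format is MM/DD/YYYY, and if it comes from India, it is DD-MM-YY.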
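
To make the ambiguity concrete, here is a minimal sketch that parses the same string under both readings. It uses the standard library's `datetime.strptime` rather than pandas so that it runs without extra dependencies; the format directives (`%d`, `%m`, `%Y`) are the same ones passed to `pd.to_datetime`. The variable names are illustrative only.

```python
from datetime import datetime

raw = "1-2-2020"

# Day-first reading: 1st February 2020
day_first = datetime.strptime(raw, "%d-%m-%Y")

# Month-first reading: 2nd January 2020
month_first = datetime.strptime(raw, "%m-%d-%Y")

print(day_first.date())    # 2020-02-01
print(month_first.date())  # 2020-01-02
```

Both calls succeed, so a wrong format goes unnoticed whenever every value in the column happens to have a day of 12 or less.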

Sample

>>> import pandas as pd
>>> df = pd.DataFrame({'dt': ['1-1-2020', '15-2-2020', '3-24-2020']})
>>> df
          dt
0   1-1-2020
1  15-2-2020
2  3-24-2020

Code - throws an error as expected

>>> pd.to_datetime(df['dt'], format = '%d-%m-%Y')
Traceback (most recent call last):
  File "/home/vishnudev/anaconda3/envs/sumyag/lib/python3.7/site-packages/pandas/core/tools/datetimes.py", line 448, in _convert_listlike_datetimes
    values, tz = conversion.datetime_to_datetime64(arg)
  File "pandas/_libs/tslibs/conversion.pyx", line 200, in pandas._libs.tslibs.conversion.datetime_to_datetime64
TypeError: Unrecognized value type: <class 'str'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/vishnudev/anaconda3/envs/sumyag/lib/python3.7/site-packages/pandas/util/_decorators.py", line 208, in wrapper
    return func(*args, **kwargs)
  File "/home/vishnudev/anaconda3/envs/sumyag/lib/python3.7/site-packages/pandas/core/tools/datetimes.py", line 778, in to_datetime
    values = convert_listlike(arg._values, True, format)
  File "/home/vishnudev/anaconda3/envs/sumyag/lib/python3.7/site-packages/pandas/core/tools/datetimes.py", line 451, in _convert_listlike_datetimes
    raise e
  File "/home/vishnudev/anaconda3/envs/sumyag/lib/python3.7/site-packages/pandas/core/tools/datetimes.py", line 416, in _convert_listlike_datetimes
    arg, format, exact=exact, errors=errors
  File "pandas/_libs/tslibs/strptime.pyx", line 142, in pandas._libs.tslibs.strptime.array_strptime
ValueError: time data '3-24-2020' does not match format '%d-%m-%Y' (match)
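
If you would rather locate the offending values than stop at the first failure, one option is the `errors='coerce'` argument of `pd.to_datetime`, which replaces anything that does not match the format with `NaT`. The sketch below reuses the sample frame from above; `parsed` and `bad_rows` are names introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'dt': ['1-1-2020', '15-2-2020', '3-24-2020']})

# errors='coerce' turns values that do not match the format into NaT
# instead of raising ValueError
parsed = pd.to_datetime(df['dt'], format='%d-%m-%Y', errors='coerce')

# Original strings that could not be parsed as day-month-year
bad_rows = df.loc[parsed.isna(), 'dt']
print(bad_rows.tolist())  # ['3-24-2020']
```

Inspecting `bad_rows` is a quick way to check whether a column really follows the format you assumed before overwriting it.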

The code below worked for me:

df['date'] = pd.to_datetime(df['date'], format = '%d-%m-%Y', unit='ns')

Or

df['date'] = pd.to_datetime(df['date'], format = '%d-%m-%Y')
df['date'] = pd.to_datetime(df.date, unit='ns')

