KeyError in pandas to_datetime using custom format

The index of my DataFrame (TradeData) is in string format:

In [30]: TradeData.index
Out[30]: Index(['09/30/2013 : 04:14 PM', '09/30/2013 : 03:53 PM', ... ], dtype=object)

I would like it to be in datetime format, but the conversion does not seem to work:

In [31]: TradeDataIdxd = pd.to_datetime(TradeData.index, format="%m/%d/%Y : %I:%M %p")
Traceback (most recent call last):

File "<ipython-input-31-1191c22cd132>", line 1, in <module>
TradeDataIdxd = pd.to_datetime(TradeData.index, format="%m/%d/%Y : %I:%M %p")

File "C:\WinPython-64bit-3.3.2.3\python-3.3.2.amd64\lib\site-packages\pandas\tseries\tools.py", line 128, in to_datetime
return _convert_listlike(arg, box=box)

File "C:\WinPython-64bit-3.3.2.3\python-3.3.2.amd64\lib\site-packages\pandas\tseries\tools.py", line 104, in _convert_listlike
result = tslib.array_strptime(arg, format)

File "tslib.pyx", line 1137, in pandas.tslib.array_strptime (pandas\tslib.c:18543)

KeyError: 'p'

None of the elements of TradeData.index is 'p'. Any idea what the problem could be? Thanks in advance.

You can circumvent this to_datetime issue by resetting the index, converting the resulting column with map and a lambda that calls datetime.strptime, and then setting the index again.

In [1058]: TradeData.index
Out[1058]: Index([u'09/30/2013 : 04:14 PM', u'09/30/2013 : 03:53 PM', u'09/30/2013 : 03:53 PM'], dtype=object)

In [1059]: index_name = TradeData.index.name

In [1060]: TradeData = TradeData.reset_index()

In [1061]: TradeData[index_name] = TradeData[index_name].map(lambda x: datetime.strptime(x, "%m/%d/%Y : %I:%M %p"))

In [1062]: TradeData = TradeData.set_index(index_name)

In [1063]: TradeData.index
Out[1063]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-09-30 16:14:00, ..., 2013-09-30 15:53:00]
Length: 3, Freq: None, Timezone: None

Not quite as concise, but it has the same effect. Or, to package it up in a function:

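from datetime import datetime

def df_index_to_datetime(df, datetime_format):
    # Relies on the index having a name: reset_index() turns it into a column of that name.
    index_name = df.index.name
    df = df.reset_index()
    # Parse every string in that column with the given strptime format.
    df[index_name] = df[index_name].map(lambda x: datetime.strptime(x, datetime_format))
    df = df.set_index(index_name)
    return df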
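
The function above looks up the column by df.index.name, so it only works when the index has a name (with an unnamed index, reset_index() creates a column called "index" and the lookup by None fails). Here is a minimal sketch of a variant that skips the reset_index/set_index round trip and so avoids that restriction. The function name index_to_datetime_direct, the "time" index name and the "price" column are made up for illustration and are not from the question.

```python
from datetime import datetime

import pandas as pd


def index_to_datetime_direct(df, datetime_format):
    """Hypothetical variant: parse a string index without reset_index/set_index."""
    # Parse each index label with the standard-library strptime.
    parsed = [datetime.strptime(label, datetime_format) for label in df.index]
    # Work on a copy so the caller's DataFrame is left untouched.
    out = df.copy()
    out.index = pd.DatetimeIndex(parsed, name=df.index.name)
    return out


# Toy data shaped like the index in the question (values are invented).
trades = pd.DataFrame(
    {"price": [10.0, 10.5]},
    index=pd.Index(["09/30/2013 : 04:14 PM", "09/30/2013 : 03:53 PM"], name="time"),
)
converted = index_to_datetime_direct(trades, "%m/%d/%Y : %I:%M %p")
print(converted.index)
```

Like the map/lambda version, this parses one string at a time in Python, so it will be slow on a very large index.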

A simpler solution would be to fix the strings so that to_datetime can parse them without a custom format, here by removing the ': ' separator:

import pandas as pd
ix = pd.Index(['09/30/2013 : 04:14 PM', '09/30/2013 : 03:53 PM'], dtype=object)
pd.to_datetime(ix.to_series().str.replace(': ', ''))

09/30/2013 : 04:14 PM   2013-09-30 16:14:00
09/30/2013 : 03:53 PM   2013-09-30 15:53:00
dtype: datetime64[ns]
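
Note that the result above is a Series that still has the original strings as its index. If the goal is a DataFrame with a DatetimeIndex, the same idea can be applied to the index itself and the result assigned back, as in the rough sketch below. The "price" column is invented for illustration. The last line is an assumption about your installed version: I would expect a reasonably recent pandas to accept the original format string, %p included, but I have not checked which release changed this.

```python
import pandas as pd

# Toy frame with the same string index as in the question (values are invented).
TradeData = pd.DataFrame(
    {"price": [10.0, 10.5]},
    index=pd.Index(["09/30/2013 : 04:14 PM", "09/30/2013 : 03:53 PM"], dtype=object),
)

# Remove the ': ' separator as a literal string, then let to_datetime parse the rest.
cleaned = TradeData.index.str.replace(": ", "", regex=False)
parsed = pd.to_datetime(cleaned)

# Keep the original strings around for comparison, then swap in the parsed index.
original_index = TradeData.index
TradeData.index = parsed
print(TradeData.index)

# Assumption: on a recent pandas the call from the question should work as written.
direct = pd.to_datetime(original_index, format="%m/%d/%Y : %I:%M %p")
```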
