Change Pandas index from integer to datetime format
I have a huge DataFrame whose index stores dates as integers, for example 20171001. I want to convert that integer form to a datetime format, so that 20171001 becomes '2017-10-01'.
For simplicity, I generate such a DataFrame:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randn(3,2), columns=list('ab'), index=[20171001,20171002,20171003])
>>> df
                 a         b
20171001  2.205108  0.926963
20171002  1.104884 -0.445450
20171003  0.621504 -0.584352
>>> df.index
Int64Index([20171001, 20171002, 20171003], dtype='int64')
If we apply 'to_datetime' to df.index, we get a strange result:
>>> pd.to_datetime(df.index)
DatetimeIndex(['1970-01-01 00:00:00.020171001',
'1970-01-01 00:00:00.020171002',
'1970-01-01 00:00:00.020171003'],
dtype='datetime64[ns]', freq=None)
What I want is DatetimeIndex(['2017-10-01', '2017-10-02', '2017-10-03'], ...).
How can I achieve this? Note that the file is given.
Use format='%Y%m%d' in pd.to_datetime, i.e.
pd.to_datetime(df.index, format='%Y%m%d')
DatetimeIndex(['2017-10-01', '2017-10-02', '2017-10-03'], dtype='datetime64[ns]', freq=None)
To assign the result back to the index:
df.index = pd.to_datetime(df.index, format='%Y%m%d')
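Putting the answer together, here is a minimal end-to-end sketch. It also shows why the question saw 1970 timestamps: without a format, pd.to_datetime reads integers as nanoseconds since the Unix epoch. The astype(str) step is my own addition, not part of the answer above; it only makes the string parsing explicit, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same shape as the DataFrame in the question: integer index in YYYYMMDD form.
df = pd.DataFrame(
    np.random.randn(3, 2),
    columns=list("ab"),
    index=[20171001, 20171002, 20171003],
)

# Without a format, each integer is taken as nanoseconds after 1970-01-01.
as_nanoseconds = pd.to_datetime(df.index)
epoch_offset = pd.Timestamp("1970-01-01") + pd.Timedelta(20171001, unit="ns")

# With an explicit format, the digits are parsed as year, month, day.
# Converting to str first is an assumption added here for clarity.
df.index = pd.to_datetime(df.index.astype(str), format="%Y%m%d")
```

After this, df.index is a DatetimeIndex, so date-based selection such as df.loc['2017-10-02'] becomes available.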
pd.to_datetime is the pandas way of doing it. But here are two alternatives:
import datetime
# Build a list (rather than a generator) so the index gets a concrete sequence
df.index = [datetime.datetime.strptime(str(i), "%Y%m%d") for i in df.index]
or
import datetime
df.index = df.index.map(lambda x: datetime.datetime.strptime(str(x),"%Y%m%d"))
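As a rough check that these alternatives agree with pd.to_datetime, the sketch below converts the same integer index three ways and compares the results. The variable names are hypothetical, and the astype(str) call is an assumption added so the pandas version parses strings explicitly.

```python
import datetime
import pandas as pd

raw = pd.Index([20171001, 20171002, 20171003])

# pandas conversion with an explicit format
via_pandas = pd.to_datetime(raw.astype(str), format="%Y%m%d")

# Python-level parsing, one element at a time
via_strptime = pd.DatetimeIndex(
    [datetime.datetime.strptime(str(i), "%Y%m%d") for i in raw]
)

# Index.map with the same strptime call
via_map = raw.map(lambda x: datetime.datetime.strptime(str(x), "%Y%m%d"))

# Compare element by element as Timestamps
all_equal = list(via_pandas) == list(via_strptime) == list(via_map)
```

The strptime versions run a Python call per element, so on a huge index pd.to_datetime with a format is likely the faster choice, though I have not measured it here.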