
How to remove microseconds from DatetimeIndex in dataframe in Python?

I want to remove the microseconds from the index.

My index looks like this:

DatetimeIndex(['2003-11-20 13:07:40.895000+00:00',
           '2003-11-20 13:16:13.039000+00:00',
           '2003-11-20 13:24:44.868000+00:00',
           '2003-11-20 13:33:17.013000+00:00',
           '2003-11-20 13:41:49.158000+00:00',
           '2003-11-20 13:50:20.987000+00:00',
           '2003-11-20 13:58:53.132000+00:00',
           '2003-11-20 14:07:24.961000+00:00',
           '2003-11-20 14:15:57.106000+00:00',
           '2003-11-20 14:24:28.935000+00:00',
           ...
           '2003-12-04 19:28:56.025000+00:00',
           '2003-12-04 19:37:27.854000+00:00',
           '2003-12-04 19:45:59.999000+00:00',
           '2003-12-04 19:54:32.143000+00:00',
           '2003-12-04 20:03:03.972000+00:00',
           '2003-12-04 20:11:36.117000+00:00',
           '2003-12-04 20:20:07.946000+00:00',
           '2003-12-04 20:28:40.091000+00:00',
           '2003-12-04 20:37:11.920000+00:00',
           '2003-12-04 20:45:44.065000+00:00'],
          dtype='datetime64[ns, UTC]'

I want to remove the microseconds so that each entry looks like this: '2003-12-04 20:45:44'. I do not want to convert it to a string; it must remain a datetime because it is the index of the dataframe. I have been searching for a solution, but I only found this, which does not work:

df.index.replace(microsecond=0, inplace = True)

Can you help me please?

Given a pd.DatetimeIndex with timezone information and millisecond data like this:

import pandas as pd

didx = pd.DatetimeIndex(['2003-11-20 13:07:40.895000+00:00',
           '2003-11-20 13:16:13.039000+00:00',
           '2003-11-20 13:24:44.868000+00:00',
           '2003-11-20 13:33:17.013000+00:00',
           '2003-11-20 13:41:49.158000+00:00',
           '2003-11-20 13:50:20.987000+00:00',
           '2003-11-20 13:58:53.132000+00:00',
           '2003-11-20 14:07:24.961000+00:00',
           '2003-11-20 14:15:57.106000+00:00',
           '2003-11-20 14:24:28.935000+00:00',
           '2003-12-04 19:28:56.025000+00:00',
           '2003-12-04 19:37:27.854000+00:00',
           '2003-12-04 19:45:59.999000+00:00',
           '2003-12-04 19:54:32.143000+00:00',
           '2003-12-04 20:03:03.972000+00:00',
           '2003-12-04 20:11:36.117000+00:00',
           '2003-12-04 20:20:07.946000+00:00',
           '2003-12-04 20:28:40.091000+00:00',
           '2003-12-04 20:37:11.920000+00:00',
           '2003-12-04 20:45:44.065000+00:00'],
          dtype='datetime64[ns, UTC]')

You can use pd.DatetimeIndex.floor to truncate the timestamps to whole seconds, and tz_localize(None) to remove the timezone information.

didx.floor('s').tz_localize(None)  # 's' = seconds; the uppercase 'S' alias is deprecated since pandas 2.2

Output:

DatetimeIndex(['2003-11-20 13:07:40', '2003-11-20 13:16:13',
               '2003-11-20 13:24:44', '2003-11-20 13:33:17',
               '2003-11-20 13:41:49', '2003-11-20 13:50:20',
               '2003-11-20 13:58:53', '2003-11-20 14:07:24',
               '2003-11-20 14:15:57', '2003-11-20 14:24:28',
               '2003-12-04 19:28:56', '2003-12-04 19:37:27',
               '2003-12-04 19:45:59', '2003-12-04 19:54:32',
               '2003-12-04 20:03:03', '2003-12-04 20:11:36',
               '2003-12-04 20:20:07', '2003-12-04 20:28:40',
               '2003-12-04 20:37:11', '2003-12-04 20:45:44'],
              dtype='datetime64[ns]', freq=None)
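
Since the question is about a dataframe index, here is a minimal sketch of applying the same idea to df.index. The frame, its 'value' column, and the two sample rows are hypothetical; only the timestamp strings come from the question. Note that floor returns a new index rather than changing anything in place, so the result has to be assigned back. The timezone is kept here; tz_localize(None) is only needed if you also want to drop the UTC offset.

```python
import pandas as pd

# Hypothetical frame for illustration; only the timestamp strings
# are taken from the question.
idx = pd.to_datetime(['2003-11-20 13:07:40.895000+00:00',
                      '2003-12-04 20:45:44.065000+00:00'], utc=True)
df = pd.DataFrame({'value': [1, 2]}, index=idx)

# floor() returns a new DatetimeIndex, so assign it back to the frame.
df.index = df.index.floor('s')

# Optional: also drop the UTC offset, as in the answer above.
naive_index = df.index.tz_localize(None)
```

After this, df.index is still a timezone-aware DatetimeIndex, just without sub-second precision.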

You should be able to use .strftime('%Y-%m-%d %H:%M:%S') on each.
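
One thing worth checking before relying on strftime: it formats the timestamps as text, so the result is an index of strings rather than a DatetimeIndex, which the question wanted to avoid. A small sketch, using two of the timestamps from the question (the variable names are made up for illustration):

```python
import pandas as pd

sample = pd.to_datetime(['2003-11-20 13:07:40.895000+00:00',
                         '2003-12-04 20:45:44.065000+00:00'], utc=True)

# strftime drops the fractional seconds, but the values become strings.
as_text = sample.strftime('%Y-%m-%d %H:%M:%S')

is_still_datetime = isinstance(as_text, pd.DatetimeIndex)  # False
```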


 