
23-25 hour days on a pandas DataFrame datetime index

I have a pandas DataFrame indexed by a DatetimeIndex. The frequency of the index varies, but it is mostly sampled once per minute.

Due to a database problem, daylight saving time is not properly handled in the index, so on particular months/days I have duplicated index values. Is there a way (without using timezones) to handle 23-25 hour days in pandas so that I can keep a linear track of time across records?

Here is a small example of my problem:

DatetimeIndex(['2014-03-12 22:59:59', '2014-03-12 22:59:59',
           '2014-03-12 23:00:59', '2014-03-12 23:00:59',
           '2014-03-12 23:01:59', '2014-03-12 23:02:59',
           '2014-03-12 23:02:59', '2014-03-12 23:03:59',
           '2014-03-12 23:03:59', '2014-03-12 23:04:59',
           '2014-03-12 23:04:59', '2014-03-12 23:05:59',
           '2014-03-12 23:06:59', '2014-03-12 23:06:59',
           '2014-03-12 23:07:59', '2014-03-12 23:07:59',
           '2014-03-12 23:08:59', '2014-03-12 23:09:59',
           '2014-03-12 23:09:59', '2014-03-12 23:10:59',
           '2014-03-12 23:10:59', '2014-03-12 23:11:59',
           '2014-03-12 23:11:59', '2014-03-12 23:12:59',
           '2014-03-12 23:13:59', '2014-03-12 23:13:59',
           '2014-03-12 23:14:59', '2014-03-12 23:14:59',
           '2014-03-12 23:15:59', '2014-03-12 23:16:59',
           '2014-03-12 23:16:59', '2014-03-12 23:17:59',
           '2014-03-12 23:17:59', '2014-03-12 23:18:59',
           '2014-03-12 23:19:59', '2014-03-12 23:19:59',
           '2014-03-12 23:20:59', '2014-03-12 23:20:59',
           '2014-03-12 23:21:59', '2014-03-12 23:22:59',
           '2014-03-12 23:22:59', '2014-03-12 23:23:59',
           '2014-03-12 23:24:59', '2014-03-12 23:24:59',
           '2014-03-12 23:25:59', '2014-03-12 23:26:59',
           '2014-03-12 23:26:59', '2014-03-12 23:27:59',
           '2014-03-12 23:27:59', '2014-03-12 23:28:59',
           '2014-03-12 23:28:59', '2014-03-12 23:29:59',
           '2014-03-12 23:30:59', '2014-03-12 23:30:59',
           '2014-03-12 23:31:59', '2014-03-12 23:31:59',
           '2014-03-12 23:32:59', '2014-03-12 23:33:59',
           '2014-03-12 23:33:59', '2014-03-12 23:34:59',
           '2014-03-12 23:34:59', '2014-03-12 23:35:59',
           '2014-03-12 23:36:59', '2014-03-12 23:36:59',
           '2014-03-12 23:37:59', '2014-03-12 23:38:59',
           '2014-03-12 23:38:59', '2014-03-12 23:39:59',
           '2014-03-12 23:40:59', '2014-03-12 23:40:59',
           '2014-03-12 23:41:59', '2014-03-12 23:42:59',
           '2014-03-12 23:42:59', '2014-03-12 23:43:59',
           '2014-03-12 23:44:59', '2014-03-12 23:44:59',
           '2014-03-12 23:45:59', '2014-03-12 23:46:59',
           '2014-03-12 23:46:59', '2014-03-12 23:47:59',
           '2014-03-12 23:48:59', '2014-03-12 23:48:59',
           '2014-03-12 23:49:59', '2014-03-12 23:49:59',
           '2014-03-12 23:50:59', '2014-03-12 23:51:59',
           '2014-03-12 23:51:59', '2014-03-12 23:52:59',
           '2014-03-12 23:52:59', '2014-03-12 23:54:59',
           '2014-03-12 23:56:59', '2014-03-12 23:58:59',
           '2014-03-12 23:54:00', '2014-03-12 23:55:59',
           '2014-03-12 23:56:59', '2014-03-12 23:57:59',
           '2014-03-12 23:59:59'],
          dtype='datetime64[ns]', name='Timestamp', freq=None)  

Your problem is that a DatetimeIndex is immutable, so you can't modify it in place; you have to build a new index and assign it over the old one.
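To illustrate the point about immutability, here is a minimal sketch. The two-row frame and the `value` column are made up for the example and are not from the question.

```python
import pandas as pd

# Hypothetical two-row frame with a duplicated timestamp.
idx = pd.DatetimeIndex(["2014-03-12 22:59:59", "2014-03-12 22:59:59"])
df = pd.DataFrame({"value": [1, 2]}, index=idx)

# Item assignment on an Index raises TypeError.
try:
    df.index[1] = pd.Timestamp("2014-03-12 23:59:59")
    mutated = True
except TypeError:
    mutated = False

# Instead, build a complete new index and assign it to df.index.
df.index = idx[:1].append(idx[1:] + pd.Timedelta(hours=1))
```

The data rows are untouched; only the labels attached to them change.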

One solution could be to "unroll" the index: keep the same number of time steps, but push every other timestamp forward (or backward) by an hour.

In the code below, index refers to the DatetimeIndex from your question:

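import pandas as pd

# `index` is the DatetimeIndex shown in the question
df = pd.DataFrame(index=index)

# every second timestamp, starting from the first, stays as it is
first_step = df.index[::2]

# every second timestamp, starting from the second, is shifted forward one hour
# (a Timedelta is used because the '1H' alias is deprecated in recent pandas)
second_step = df.index[1::2].shift(periods=1, freq=pd.Timedelta(hours=1))

# the index is immutable, so build a new one and assign it;
# note that append() places all shifted timestamps after the unshifted ones
new_index = first_step.append(second_step)
df.index = new_index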

I can't help but feel that this is a bit weird. Tell me if it helps.
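The sample index repeats some timestamps but not others (for example, 23:01:59 appears only once), so slicing every other element may not line up with the actual repeats. A possible variant, sketched here under the assumption that the second occurrence of a timestamp belongs to the following hour, is to detect repeats with `Index.duplicated` and shift only those. The short index and the `value` column are hypothetical.

```python
import pandas as pd

# Hypothetical sample: two timestamps recorded twice, one recorded once.
index = pd.DatetimeIndex([
    "2014-03-12 22:59:59", "2014-03-12 22:59:59",
    "2014-03-12 23:00:59", "2014-03-12 23:00:59",
    "2014-03-12 23:01:59",
], name="Timestamp")
df = pd.DataFrame({"value": range(len(index))}, index=index)

# True for each repeat of a timestamp already seen; first occurrences are False.
is_repeat = df.index.duplicated(keep="first")

# One-hour offset for repeats, zero for everything else (assumed direction: forward).
offsets = pd.to_timedelta(is_repeat.astype(int), unit="h")

# Row order is preserved; only the repeated labels move.
df.index = df.index + offsets
```

Whether the repeats should move forward or backward depends on which DST transition produced them, so the direction of the offset is something to check against the data.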
