Pandas datetime index ceil to specific hour in day

I have a datetime index that I would like to round up (ceil) to a specific hour of the day. I am already aware of pandas' offset aliases and how they work, but here I want to round each datetime to a specific hour of the day (or a specific day of the month). For example, I would like this kind of transformation:

print(results.index)
DatetimeIndex(['2018-12-14 05:00:00+01:00', '2018-12-14 06:00:00+01:00',
           '2018-12-14 07:00:00+01:00', '2018-12-14 08:00:00+01:00',
           '2018-12-14 09:00:00+01:00', '2018-12-14 10:00:00+01:00',
           '2018-12-14 11:00:00+01:00', '2018-12-14 12:00:00+01:00',
           '2018-12-14 13:00:00+01:00', '2018-12-14 14:00:00+01:00',

Turns into

DatetimeIndex(['2018-12-14 08:00:00+01:00', '2018-12-14 08:00:00+01:00',
           '2018-12-14 08:00:00+01:00', '2018-12-14 08:00:00+01:00',
           '2018-12-15 08:00:00+01:00', '2018-12-15 08:00:00+01:00',
           '2018-12-15 08:00:00+01:00', '2018-12-15 08:00:00+01:00',
           '2018-12-15 08:00:00+01:00', '2018-12-15 08:00:00+01:00',

As far as I'm aware, ceil() has no parameter that allows this, since it can only round to the nearest hour, day, month (freq='H', 'D', 'M') and so on. Is there an elegant solution to this, or would I have to code my own for loop?

One idea is to use numpy.where with offsets.DateOffset. Here hour (without an s) sets the hour to 8, while days (with an s) adds one day to the original date:

import numpy as np
import pandas as pd

d = pd.DatetimeIndex(['2018-12-14 05:00:00+01:00', '2018-12-14 06:00:00+01:00',
                      '2018-12-14 07:00:00+01:00', '2018-12-14 08:00:00+01:00',
                      '2018-12-14 09:00:00+01:00', '2018-12-14 10:00:00+01:00',
                      '2018-12-14 11:00:00+01:00', '2018-12-14 12:00:00+01:00',
                      '2018-12-14 13:00:00+01:00', '2018-12-14 14:00:00+01:00'])

results = pd.DataFrame(index=d)

# Hours up to 8 are set to 08:00 on the same day; later hours move to 08:00 on the next day.
out = np.where(results.index.hour <= 8,
               results.index + pd.offsets.DateOffset(hour=8),
               results.index + pd.offsets.DateOffset(days=1, hour=8))

print(pd.DatetimeIndex(out))
DatetimeIndex(['2018-12-14 08:00:00+01:00', '2018-12-14 08:00:00+01:00',
               '2018-12-14 08:00:00+01:00', '2018-12-14 08:00:00+01:00',
               '2018-12-15 08:00:00+01:00', '2018-12-15 08:00:00+01:00',
               '2018-12-15 08:00:00+01:00', '2018-12-15 08:00:00+01:00',
               '2018-12-15 08:00:00+01:00', '2018-12-15 08:00:00+01:00'],
              dtype='datetime64[ns, pytz.FixedOffset(60)]', freq=None)
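
The difference between the singular and plural keywords is easiest to see on a single Timestamp. The snippet below is only an illustrative sketch; the sample timestamp with a non-zero minute is an assumption and is not part of the question's data:

```python
import pandas as pd

# Hypothetical sample value with non-zero minutes, chosen for illustration.
ts = pd.Timestamp("2018-12-14 05:30:00")

# Singular keyword: replace the hour field with 8, leave other fields alone.
replaced = ts + pd.offsets.DateOffset(hour=8)

# Plural keyword: add 8 hours to the timestamp.
shifted = ts + pd.offsets.DateOffset(hours=8)

# Mixed: add one day and replace the hour field with 8.
combined = ts + pd.offsets.DateOffset(days=1, hour=8)

print(replaced)  # 2018-12-14 08:30:00
print(shifted)   # 2018-12-14 13:30:00
print(combined)  # 2018-12-15 08:30:00
```

Note that hour=8 leaves the minutes untouched, so this approach lands exactly on 08:00 only when the timestamps already sit on the hour, as they do in the question.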

Another idea is to use Timedeltas and add a day only where the condition is True:

# True for timestamps later than 08:00, which must roll over to the next day
m = results.index.hour > 8
out = results.index + pd.offsets.DateOffset(hour=8) + pd.Timedelta(days=1) * m
print(out)
DatetimeIndex(['2018-12-14 08:00:00+01:00', '2018-12-14 08:00:00+01:00',
               '2018-12-14 08:00:00+01:00', '2018-12-14 08:00:00+01:00',
               '2018-12-15 08:00:00+01:00', '2018-12-15 08:00:00+01:00',
               '2018-12-15 08:00:00+01:00', '2018-12-15 08:00:00+01:00',
               '2018-12-15 08:00:00+01:00', '2018-12-15 08:00:00+01:00'],
              dtype='datetime64[ns, pytz.FixedOffset(60)]', freq=None)

Or floor to midnight first and then add 8 hours:

m = results.index.hour > 8
out = results.index.floor('d') + pd.Timedelta(hours=8) + pd.Timedelta(days=1) * m
print(out)
DatetimeIndex(['2018-12-14 08:00:00+01:00', '2018-12-14 08:00:00+01:00',
               '2018-12-14 08:00:00+01:00', '2018-12-14 08:00:00+01:00',
               '2018-12-15 08:00:00+01:00', '2018-12-15 08:00:00+01:00',
               '2018-12-15 08:00:00+01:00', '2018-12-15 08:00:00+01:00',
               '2018-12-15 08:00:00+01:00', '2018-12-15 08:00:00+01:00'],
              dtype='datetime64[ns, pytz.FixedOffset(60)]', freq=None)
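
The floor-based variant can be wrapped in a small helper that also handles timestamps with non-zero minutes. This is a minimal sketch, not a pandas API: the function name ceil_to_hour_of_day and its parameters are hypothetical, and it assumes a naive or fixed-offset index (in a zone with DST changes, adding a 24-hour Timedelta may not land on the same wall-clock hour).

```python
import pandas as pd


def ceil_to_hour_of_day(index: pd.DatetimeIndex, hour: int) -> pd.DatetimeIndex:
    """Round each timestamp up to the next occurrence of `hour`:00.

    Hypothetical helper for illustration; assumes a naive or
    fixed-offset DatetimeIndex.
    """
    # Same-day target: midnight of each timestamp's day plus `hour` hours.
    target = index.floor("D") + pd.Timedelta(hours=hour)
    # Timestamps strictly after the same-day target roll over to the next day.
    late = index > target
    return target + pd.to_timedelta(late.astype(int), unit="D")


idx = pd.DatetimeIndex(["2018-12-14 05:00:00+01:00",
                        "2018-12-14 08:00:00+01:00",
                        "2018-12-14 08:30:00+01:00",
                        "2018-12-14 14:00:00+01:00"])
print(ceil_to_hour_of_day(idx, 8))
```

Comparing against the full target timestamp rather than only the hour field means 08:30 moves to 08:00 on the next day, while exactly 08:00 stays where it is.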
