简体   繁体   中英

How do I label a Pandas datetime row (in a new column) for every 15 minute interval?

I have a pandas dataframe that looks something like this:

Date
31/8/20 4:56:00
31/8/20 4:45:00
31/8/20 4:39:00
31/8/20 4:36:00
31/8/20 4:34:00
31/8/20 4:05:00
31/8/20 4:04:00
31/8/20 3:19:00
31/8/20 2:54:00

I would like it to group rows on a 15-minute basis and label each row if it falls into that 15-minute datetime category. So I'd like my output to look like this:

Date              Interval
31/8/20 4:56:00   Period 1
31/8/20 4:45:00   Period 1
31/8/20 4:39:00   Period 2
31/8/20 4:36:00   Period 2
31/8/20 4:34:00   Period 2
31/8/20 4:05:00   Period 4
31/8/20 4:04:00   Period 4
31/8/20 3:19:00   Period 7
31/8/20 2:54:00   Period 9

As you can see 'Period 1' contains all datetimes within the interval of 31/8/20 4:45 - 31/8/20 4:59:00. Period gaps should occur if there is no corresponding row entry for that period. This is why 'Period 4' is not 'Period 3'

I have tried the following:

df = pd.DataFrame()
df = df.sort_values(by=['Date'], ascending=False)
df['Date'] = pd.to_datetime(df['Date'])
df['Interval'] = 'Period ' + (((df.shift()['Date'] - df['Date']).dt.seconds > 900).cumsum() + 1).astype(str)

However, this returns 'Period n' within the 'Interval' column if the datetime row is within 15-minutes of the previous datetime row.

Date              Interval
31/8/20 4:56:00   Period 1
31/8/20 4:45:00   Period 2
31/8/20 4:39:00   Period 3
31/8/20 4:36:00   Period 4
31/8/20 4:34:00   Period 5
31/8/20 4:05:00   Period 6
31/8/20 4:04:00   Period 7
31/8/20 3:19:00   Period 8
31/8/20 2:54:00   Period 9

Rather I would like to label each row with a 'Period' regardless of the previous rows datetime and based on four 15-minute intervals within the hour such that Periods are sequentially labelled based on

  • dd/mm/yy h:45:00 - h:59:59,
  • dd/mm/yy h:30:00 - h:44:59,
  • dd/mm/yy h:15:00 - h:29:59,
  • dd/mm/yy h:00:00 - h:14:59

Thank you in advance.

Use DataFrame.groupby on Grouper with frequency of 15min , then use groupby.ngroup with optional parameter ascending=False to number each group:

s = df.groupby(pd.Grouper(freq='15min', key='Date')).ngroup(ascending=False).add(1)
df['Interval'] = 'Period ' + s.astype(str)

Result:

                 Date  Interval
0 2020-08-31 04:56:00  Period 1
1 2020-08-31 04:45:00  Period 1
2 2020-08-31 04:39:00  Period 2
3 2020-08-31 04:36:00  Period 2
4 2020-08-31 04:34:00  Period 2
5 2020-08-31 04:05:00  Period 4
6 2020-08-31 04:04:00  Period 4
7 2020-08-31 03:19:00  Period 7
8 2020-08-31 02:54:00  Period 9

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM