
How to find mean of n previous rows in a column in pandas based on date criteria?

I have a dataset that looks like this:

value1  value2  value3  date
17      21      22      2005-04-01 12:05:00
19      20      24      2005-04-01 12:06:00
16      26      23      2005-04-01 12:07:00

I need to transform it so that each row whose date ends in :05:00 (the 5th minute of each hour) holds the average of the previous 60 rows.

I tried to use groupby based on the datetime; it does give average values for each hour (minutes 00-59), but I need to adjust it for my case.
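
Something along these lines (a rough sketch, not the exact code; column names as in the sample above):

import pandas as pd

# Group by the calendar hour: this yields one mean per hour (minutes 00-59),
# not a trailing 60-row average ending at the 5th minute
df['date'] = pd.to_datetime(df['date'])
hourly = df.groupby(df['date'].dt.floor('H'))[['value1', 'value2', 'value3']].mean()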

In the end I would like to have something like this:

value1  value2  value3  date
17      21      22      2005-04-01 12:05:00
19      20      24      2005-04-01 13:05:00
16      26      23      2005-04-01 14:05:00

where 17, for instance, is the average of the previous 60 values in the value1 column.

This will create a rolling mean over 60-minute windows (make sure the date column has datetime64[ns] dtype; if not, convert it beforehand), then you can select the necessary rows with .loc[]:

df.rolling('H', on='date').mean().loc[lambda x: x['date'].dt.minute == 5]

See the docs for further details on .rolling() and .loc[].
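
For reference, a minimal end-to-end sketch under those assumptions (the value columns and timestamps are illustrative, made-up data):

import pandas as pd
import numpy as np

# Illustrative per-minute readings covering three hours (made-up values)
rng = pd.date_range('2005-04-01 12:00:00', periods=180, freq='min')
df = pd.DataFrame({
    'value1': np.random.randint(10, 30, size=180),
    'value2': np.random.randint(10, 30, size=180),
    'value3': np.random.randint(10, 30, size=180),
    'date': rng.astype(str),  # raw data often arrives as strings
})

# Ensure the date column is datetime64[ns] before rolling on it
df['date'] = pd.to_datetime(df['date'])

# Rolling mean over a one-hour window keyed on the date column,
# then keep only the rows whose timestamp falls on the 5th minute
result = (
    df.rolling('H', on='date')
      .mean()
      .loc[lambda x: x['date'].dt.minute == 5]
)
print(result)

With the sample timestamps above, result keeps the rows at 12:05, 13:05 and 14:05, each holding the mean of the preceding hour of readings (the very first one averages fewer rows, since offset-based rolling defaults to min_periods=1).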
