I've got a dataframe 'activity_level'. Here is the column I want to use:
activity_level['MP']
Date_and_time
2020-07-24 21:00:00 0.0
2020-07-24 21:01:00 0.0
2020-07-24 21:02:00 0.0
2020-07-24 21:03:00 0.0
2020-07-24 21:04:00 0.0
2020-07-24 21:05:00 0.0
2020-07-24 21:06:00 0.0
2020-07-24 21:07:00 0.0
2020-07-24 21:08:00 0.0
2020-07-24 21:09:00 0.0
2020-07-24 21:10:00 0.0
2020-07-24 21:11:00 0.0
2020-07-24 21:12:00 0.0
2020-07-24 21:13:00 0.0
2020-07-24 21:14:00 0.0
2020-07-24 21:15:00 0.0
2020-07-24 21:16:00 0.0
2020-07-24 21:17:00 0.0
2020-07-24 21:18:00 0.0
2020-07-24 21:19:00 0.0
2020-07-24 21:20:00 0.0
2020-07-24 21:21:00 0.0
2020-07-24 21:22:00 0.0
2020-07-24 21:23:00 0.0
2020-07-24 21:24:00 0.0
2020-07-24 21:25:00 0.0
2020-07-24 21:26:00 0.0
2020-07-24 21:27:00 0.0
2020-07-24 21:28:00 0.0
2020-07-24 21:29:00 0.0
2020-07-24 21:30:00 0.0
2020-07-24 21:31:00 0.0
2020-07-24 21:32:00 0.0
2020-07-24 21:33:00 0.0
2020-07-24 21:34:00 0.0
2020-07-24 21:35:00 0.0
2020-07-24 21:36:00 0.0
2020-07-24 21:37:00 0.0
2020-07-24 21:38:00 0.0
2020-07-24 21:39:00 0.0
2020-07-24 21:40:00 0.0
2020-07-24 21:41:00 0.0
2020-07-24 21:42:00 0.0
2020-07-24 21:43:00 0.0
2020-07-24 21:44:00 0.0
2020-07-24 21:45:00 0.0
2020-07-24 21:46:00 0.0
2020-07-24 21:47:00 0.0
2020-07-24 21:48:00 0.0
2020-07-24 21:49:00 0.0
2020-07-24 21:50:00 0.0
2020-07-24 21:51:00 0.0
2020-07-24 21:52:00 0.0
2020-07-24 21:53:00 0.0
2020-07-24 21:54:00 0.0
2020-07-24 21:55:00 0.0
2020-07-24 21:56:00 0.0
2020-07-24 21:57:00 0.0
2020-07-24 21:58:00 0.0
2020-07-24 21:59:00 0.0
Name: MP, dtype: float64
I want to calculate the 3-minutely mean and assign a one to each 1-minutely value if the 3-minutely mean exceeds 15, and a zero otherwise. For the first 3 values in activity_level['MP'] the mean is 0, so I want to assign a zero to those first 3 values. I have created an empty column to hold these zeroes and ones.
I've tried the following, but I can't get it to work right:
#create empty column
activity_level['walking_frame'] = ""
#calculate 3-minutely mean
walking_activity = activity_level.resample('180s').mean()
#create linspaced vector to loop over
vector = np.linspace(0,60,20,endpoint=False).tolist()
vector = [ int(x) for x in vector ]
activity_level2 = activity_level.copy()
#loop to fill in zeroes or ones in empty column
for MP_id,MP in enumerate(walking_activity['MP']):
    if MP > 15:
        activity_level2['walking_frame'][vector[MP_id]:vector[MP_id+1]] == 1
    else:
        activity_level2['walking_frame'][vector[MP_id]:vector[MP_id+1]] == 0
Any help would be much appreciated!
So, in my understanding, you are looking for a rolling mean and a shifted result on it. For future questions, please provide a DataFrame definition and example data that would actually produce some results for your test. Your 60 rows are all zeros, so they would never trigger your intended action. Hence, for the fun of it, I've used the Fibonacci sequence as values.
This solution assumes that your data are minutely values, as displayed in your example.
import pandas as pd
# create a test dataframe
test_df = pd.DataFrame({
    "Date_and_time": ["2020-07-24 21:00:00", "2020-07-24 21:01:00", "2020-07-24 21:02:00",
                      "2020-07-24 21:03:00", "2020-07-24 21:04:00", "2020-07-24 21:05:00", "2020-07-24 21:06:00",
                      "2020-07-24 21:07:00", "2020-07-24 21:08:00", "2020-07-24 21:09:00"],
    "value": [0.0, 1.0, 1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 21.0, 34.0]
})
# cast the times as datetimes
test_df = test_df.assign(Date_and_time=lambda x: pd.to_datetime(x.Date_and_time))
# this is all you need, above is just setup
res = (
    test_df
    .assign(
        # create a rolling mean over the current row and the two before it
        rolling_mean=lambda x: x.value.rolling(3).mean(),
        # create an indicator ("alert") that is True if the rolling mean
        # 2 minutes later is greater than 15
        alert=lambda x: x.rolling_mean.shift(-2) > 15)
)
print(res)
And the output looks like this (the rolling mean of minutes 7 through 9 is 22.67, so you are alerted at minute 7):
        Date_and_time  value  rolling_mean  alert
0 2020-07-24 21:00:00    0.0           NaN  False
1 2020-07-24 21:01:00    1.0           NaN  False
2 2020-07-24 21:02:00    1.0      0.666667  False
3 2020-07-24 21:03:00    2.0      1.333333  False
4 2020-07-24 21:04:00    3.0      2.000000  False
5 2020-07-24 21:05:00    5.0      3.333333  False
6 2020-07-24 21:06:00    8.0      5.333333  False
7 2020-07-24 21:07:00   13.0      8.666667   True
8 2020-07-24 21:08:00   21.0     14.000000  False
9 2020-07-24 21:09:00   34.0     22.666667  False
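If you would rather have fixed, non-overlapping 3-minute blocks (which is what the resample('180s') attempt in the question seems to aim for) and write a 0 or 1 to every minute of each block, one possible sketch is below. It is not the rolling approach shown above: it assumes the timestamps are the index, reuses the Fibonacci test values, and borrows the walking_frame column name from the question.

```python
import pandas as pd

# Same Fibonacci test values as above, indexed by minute (illustrative data).
idx = pd.date_range("2020-07-24 21:00:00", periods=10, freq="min")
df = pd.DataFrame(
    {"value": [0.0, 1.0, 1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 21.0, 34.0]},
    index=idx,
)

# Mean of each non-overlapping 3-minute block, broadcast back onto
# every row that belongs to that block.
block_mean = df["value"].groupby(pd.Grouper(freq="3min")).transform("mean")

# 1 where the block mean exceeds 15, else 0.
df["walking_frame"] = (block_mean > 15).astype(int)
print(df)
```

Note that a trailing block with fewer than three rows (here the single row at 21:09) is averaged over the rows it has, so you may want to handle it separately.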
Side note: if your data is not exactly minutely as in the example, you can set the Date_and_time column as the index and use the .rolling() function with a time-based window. See the pandas docs for the rolling function.
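As a rough illustration of the time-window variant, here is a minimal sketch. The irregular timestamps and values are made up for the example; the only pandas feature relied on is that .rolling() accepts an offset string such as "3min" when the index is a DatetimeIndex.

```python
import pandas as pd

# Irregular timestamps (illustrative data; note the gap between 21:02 and 21:05).
times = pd.to_datetime([
    "2020-07-24 21:00:00", "2020-07-24 21:01:00", "2020-07-24 21:02:00",
    "2020-07-24 21:05:00", "2020-07-24 21:06:00",
])
irregular = pd.DataFrame({"value": [10.0, 20.0, 30.0, 40.0, 50.0]}, index=times)
irregular.index.name = "Date_and_time"

# "3min" averages every row whose timestamp lies in the 3 minutes ending at
# (and including) the current row, however many rows that happens to be.
irregular["rolling_mean"] = irregular["value"].rolling("3min").mean()
print(irregular)
```

With a time-based window the first rows are not NaN, because the window simply uses whatever rows fall inside it.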