How to aggregate time series data over a specific range?

Question

I have a pandas dataframe that looks like this, where each row represents data collected on a different day (days 1 to 5) for each participant (long format).

ID    Heart_Rate
1         89
1         98
1         99 
1         73 
1         54
...
24        88
24        90
24        79
24        92
24        97

How can I aggregate the data over the first 3 days for each participant, so that I get a new dataframe with one row per participant containing the mean heart rate over those 72 hours?

Answer 1

We can set the index of the dataframe to ID, group on level=0, use head to select the first three rows for each user ID, and then take the mean per ID (again on level=0) to get the average heart rate over the first 72 hours:

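# DataFrame.mean(level=...) was removed in pandas 2.0, so group on the index level again instead
out = df.set_index('ID').groupby(level=0).head(3).groupby(level=0).mean()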
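
For readers who want to try this end to end, here is a minimal sketch that rebuilds the two participants shown in the question and applies the same "first three rows, then mean" idea. It assumes the rows are already in day order within each ID, which the question implies but does not state; the intermediate name first_three is only illustrative.

```python
import pandas as pd

# Sample data from the question: five daily readings for participants 1 and 24
df = pd.DataFrame({
    "ID": [1] * 5 + [24] * 5,
    "Heart_Rate": [89, 98, 99, 73, 54, 88, 90, 79, 92, 97],
})

# Keep the first three rows of each participant (assumes rows are in day order)
first_three = df.groupby("ID").head(3)

# Average the retained rows per participant: one row per ID
out = first_three.groupby("ID").mean()
print(out)
```

Grouping on the ID column directly avoids the set_index step and should give the same per-participant means.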

An alternative approach that is more efficient, but applicable only if every user ID has the same number of rows and the dataframe is sorted on the ID column:

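import numpy as np

n_days = 5  # Number of rows present for each user ID
n_days_to_avg = 3  # First n rows/days to average

# True for rows whose position within each block of n_days rows is one of the first n_days_to_avg
m = np.isin(np.r_[:len(df)] % n_days, np.r_[:n_days_to_avg])
out = df[m].groupby('ID').mean()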

>>> out

    Heart_Rate
ID            
1    95.333333
24   85.666667
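
If the participants do not all have the same number of rows, a per-group counter can stand in for the modulo trick. The following is a sketch of that variant using groupby(...).cumcount(), which is not part of the original answer. It still assumes rows are in day order within each ID, and the sample frame is the question's data with the last reading for ID 24 dropped, purely to illustrate unequal group sizes.

```python
import pandas as pd

# Illustrative data: ID 1 has five readings, ID 24 only four (unequal group sizes)
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 24, 24, 24, 24],
    "Heart_Rate": [89, 98, 99, 73, 54, 88, 90, 79, 92],
})

n_days_to_avg = 3  # First n rows/days to average

# cumcount numbers the rows 0, 1, 2, ... within each ID
day_index = df.groupby("ID").cumcount()

# Keep only the first n_days_to_avg rows of each ID, then average per ID
mask = day_index < n_days_to_avg
out = df[mask].groupby("ID")["Heart_Rate"].mean()
print(out)
```

Because the counter restarts for each ID, this does not depend on a fixed n_days or on the dataframe being sorted by ID.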

Answer 1: 2, accepted, 2021-03-18 15:18:48
