
Binned physiological time series data in R: calculate duration spent in each bin

I have a dataset containing changes in mean arterial blood pressure (MAP) over time from multiple participants. Here is an example dataframe:

df <- structure(list(
  ID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L),
  Time = structure(1:14, .Label = c(
    "11:02:00", "11:03:00", "11:04:00", "11:05:00", "11:06:00", "11:07:00", "11:08:00",
    "13:30:00", "13:31:00", "13:32:00", "13:33:00", "13:34:00", "13:35:00", "13:36:00"
  ), class = "factor"),
  MAP = c(
    90.27999878, 84.25, 74.81999969, 80.87000275, 99.38999939, 81.51000214, 71.51000214,
    90.08999634, 88.75, 84.72000122, 83.86000061, 94.18000031, 98.54000092, 51
  )
), class = "data.frame", row.names = c(NA, -14L))

I have binned the data into groups (e.g. MAP 40-60, 60-80 and 80-100) and added a flag (1, 2 or 3) identifying the bin in an additional column, map_bin. This is my code so far:

library(dplyr)

# Mean arterial pressure (MAP) bins
# Bin 1 = 40-60; Bin 2 = 60-80; Bin 3 = 80-100
map_bin <- c("1", "2", "3")

output <- as_tibble(df) %>% 
  mutate(map_bin = case_when(
    MAP >= 40 & MAP < 60 ~ map_bin[1],
    MAP >= 60 & MAP < 80 ~ map_bin[2],
    MAP >= 80 & MAP < 100 ~ map_bin[3]
  ))
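
As a side note, the same half-open bins can also be written with base R's cut(). This is only a sketch of an alternative, not part of the original code; the MAP values and the name binned below are made up for illustration. With right = FALSE each interval is closed on the left and open on the right, which matches the >= / < conditions above, and values outside 40-100 become NA just as they do with case_when.

```r
library(dplyr)

# Hypothetical MAP values, chosen only to hit each bin once or twice
binned <- tibble(MAP = c(90.28, 74.82, 51, 99.39)) %>%
  mutate(
    # breaks define [40, 60), [60, 80), [80, 100); labels mirror map_bin
    map_bin = as.character(cut(
      MAP,
      breaks = c(40, 60, 80, 100),
      labels = c("1", "2", "3"),
      right  = FALSE
    ))
  )

binned$map_bin
```

cut() returns a factor, so as.character() is used here to keep map_bin a character column as in the case_when version.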

For each ID I wish to calculate, in an additional column, the total time MAP is in each bin. I expect the following output:

ID Time MAP map_bin map_bin_dur
1 11:02:00 90.27999878 3 5
1 11:03:00 84.25 3 5
1 11:04:00 74.81999969 2 2
1 11:05:00 80.87000275 3 5
1 11:06:00 99.38999939 3 5
1 11:07:00 81.51000214 3 5
1 11:08:00 71.51000214 2 2
2 13:30:00 90.08999634 3 6
2 13:31:00 88.75 3 6
2 13:32:00 84.72000122 3 6
2 13:33:00 83.86000061 3 6
2 13:34:00 94.18000031 3 6
2 13:35:00 98.54000092 3 6
2 13:36:00 51 1 1

Here map_bin_dur is the total time, in minutes, that each individual's MAP spent in a given bin. For example, ID 1 had a MAP in bin 3 for 5 minutes in total.

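If the Time column always advances in 1-minute steps, each row represents one minute, so the number of rows per ID and bin equals the duration in minutes. You can get that with add_count: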

library(dplyr)

output <- output %>% add_count(ID, map_bin, name = 'map_bin_dur')
output

#      ID Time       MAP map_bin map_bin_dur
#   <int> <fct>    <dbl> <chr>         <int>
# 1     1 11:02:00  90.3 3                 5
# 2     1 11:03:00  84.2 3                 5
# 3     1 11:04:00  74.8 2                 2
# 4     1 11:05:00  80.9 3                 5
# 5     1 11:06:00  99.4 3                 5
# 6     1 11:07:00  81.5 3                 5
# 7     1 11:08:00  71.5 2                 2
# 8     2 13:30:00  90.1 3                 6
# 9     2 13:31:00  88.8 3                 6
#10     2 13:32:00  84.7 3                 6
#11     2 13:33:00  83.9 3                 6
#12     2 13:34:00  94.2 3                 6
#13     2 13:35:00  98.5 3                 6
#14     2 13:36:00  51   1                 1
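
If the readings are not evenly spaced, counting rows no longer gives minutes. A rough sketch of one way to handle that case is below: treat each reading as lasting until the next reading for the same ID, then sum those intervals per ID and bin. The data frame irregular, the helper columns t and step_min, and the choice to give the last reading of each ID a nominal 1 minute are all assumptions made for illustration, not something stated in the question.

```r
library(dplyr)

# Hypothetical, irregularly sampled data (not from the question)
irregular <- tibble(
  ID      = c(1L, 1L, 1L, 1L),
  Time    = c("11:02:00", "11:04:00", "11:05:00", "11:08:00"),
  map_bin = c("3", "2", "3", "3")
)

result <- irregular %>%
  # as.character() also covers the case where Time is stored as a factor
  mutate(t = as.POSIXct(as.character(Time), format = "%H:%M:%S", tz = "UTC")) %>%
  group_by(ID) %>%
  arrange(t, .by_group = TRUE) %>%
  mutate(
    # minutes until the next reading for the same ID
    step_min = as.numeric(difftime(lead(t), t, units = "mins")),
    # assumption: the final reading of each ID counts as 1 minute
    step_min = coalesce(step_min, 1)
  ) %>%
  group_by(ID, map_bin) %>%
  mutate(map_bin_dur = sum(step_min)) %>%
  ungroup() %>%
  select(-t, -step_min)

result
```

With these made-up readings the intervals are 2, 1, 3 and (assumed) 1 minutes, so bin 3 accumulates 6 minutes and bin 2 accumulates 1 minute. How to treat the last reading is a judgment call; dropping it or using the median interval would be equally defensible.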
