How to adjust my data.frame to value/second?

I need to analyse heart-rate (HR) measurements from a device. However, the device reports HR per second in a very odd way. There is a column 'duration' giving how many seconds a certain HR was measured for, a column 'heart_rate' holding the HR value in the same row, and a column 'startdate' holding a date and time stamp. The catch is that the duration in, say, row 2 (let's say it is 3) is counted from the startdate of row 1 (so startdate in row 1 would be, for example, 06.09.21 07:24:23 and in row 2 06.09.21 07:24:26). The duration in row 2 therefore says for how many seconds the 'heart_rate' value in row 1 was measured. It looks like this:

   duration heart_rate startdate          
      <dbl>      <dbl> <dttm>             
 1        1         74 2021-09-06 07:25:33
 2        1         74 2021-09-06 07:25:34
 3        2         71 2021-09-06 07:25:36
 4        4         71 2021-09-06 07:25:40
 5        2         72 2021-09-06 07:25:42
 6        6         72 2021-09-06 07:25:48
 7        2         74 2021-09-06 07:25:50
 8        5         76 2021-09-06 07:25:55
 9        4         75 2021-09-06 07:25:59
10        2         75 2021-09-06 07:26:01

I expanded the 10 rows above into the desired format by hand. This is what I want it to look like:

   duration heart_rate startdate          
      <dbl>      <dbl> <dttm>             
 1        1         74 2021-09-06 07:25:33
 2        1         74 2021-09-06 07:25:34
 3        1         74 2021-09-06 07:25:35
 4        1         71 2021-09-06 07:25:36
 5        1         71 2021-09-06 07:25:37
 6        1         71 2021-09-06 07:25:38
 7        1         71 2021-09-06 07:25:39
 8        1         71 2021-09-06 07:25:40
 9        1         71 2021-09-06 07:25:41
10        1         72 2021-09-06 07:25:42
11        1         72 2021-09-06 07:25:43
12        1         72 2021-09-06 07:25:44
13        1         72 2021-09-06 07:25:45
14        1         72 2021-09-06 07:25:46
15        1         72 2021-09-06 07:25:47
16        1         72 2021-09-06 07:25:48
17        1         72 2021-09-06 07:25:49
18        1         74 2021-09-06 07:25:50
19        1         74 2021-09-06 07:25:51
20        1         74 2021-09-06 07:25:52
21        1         74 2021-09-06 07:25:53
22        1         74 2021-09-06 07:25:54
23        1         76 2021-09-06 07:25:55
24        1         76 2021-09-06 07:25:56
25        1         76 2021-09-06 07:25:57
26        1         76 2021-09-06 07:25:58
27        1         75 2021-09-06 07:25:59
28        1         75 2021-09-06 07:26:00

Additionally, it is crucial to get a time stamp for every second across the whole data.frame, because the device produces a lot of NA values and I'd like to see for which time periods (when and for how many seconds) the data is missing. I am new to R and have never had to handle anything close to this before, so I am rather lost and have no idea how to tackle it properly. Thank you everyone for your help!

Sounds like a job for a rolling join (using data.table).
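In case rolling joins are new: with roll = Inf, every row that has no exact match takes the last earlier observation instead (last observation carried forward). A minimal sketch on made-up data; the tables `readings` and `grid` are hypothetical and only illustrate the idea:

```r
library(data.table)

# hypothetical toy data: one reading at t = 1 and another at t = 4
readings <- data.table(t = c(1L, 4L), value = c(10, 20))
# every integer time step we want a value for
grid <- data.table(t = 1:6)

# roll = Inf: where grid$t has no exact match in readings,
# take the most recent earlier reading
filled <- readings[grid, on = .(t), roll = Inf]
filled$value
# 10 10 10 20 20 20
```

The same idea applies to the heart-rate data, with one row per second playing the role of `grid`.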

library(data.table)
# sample data
DT <- fread("   duration heart_rate startdate          
        1         74 2021-09-06T07:25:33
        1         74 2021-09-06T07:25:34
        2         71 2021-09-06T07:25:36
        4         71 2021-09-06T07:25:40
        2         72 2021-09-06T07:25:42
        6         72 2021-09-06T07:25:48
        2         74 2021-09-06T07:25:50
        5         76 2021-09-06T07:25:55
        4         75 2021-09-06T07:25:59
        2         75 2021-09-06T07:26:01")
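# parse the ISO 8601 strings; format must be passed by name, because the second
# positional argument of as.POSIXct() is tz (tz = "UTC" is an assumption here)
DT[, startdate := as.POSIXct(startdate, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")]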

# create new data.table by second
DT2 <- data.table( timestamp = seq(min(DT$startdate), max(DT$startdate), by = 1))
# join in data using a rolling join: roll = Inf carries each heart_rate
# forward, second by second, until the next reading starts
DT2[, heart_rate := DT[DT2, heart_rate, on = .(startdate = timestamp), roll = Inf]]
#               timestamp heart_rate
#  1: 2021-09-06 07:25:33         74
#  2: 2021-09-06 07:25:34         74
#  3: 2021-09-06 07:25:35         74
#  4: 2021-09-06 07:25:36         71
#  5: 2021-09-06 07:25:37         71
#  6: 2021-09-06 07:25:38         71
#  7: 2021-09-06 07:25:39         71
#  8: 2021-09-06 07:25:40         71
#  9: 2021-09-06 07:25:41         71
# 10: 2021-09-06 07:25:42         72
# 11: 2021-09-06 07:25:43         72
# 12: 2021-09-06 07:25:44         72
# 13: 2021-09-06 07:25:45         72
# 14: 2021-09-06 07:25:46         72
# 15: 2021-09-06 07:25:47         72
# 16: 2021-09-06 07:25:48         72
# 17: 2021-09-06 07:25:49         72
# 18: 2021-09-06 07:25:50         74
# 19: 2021-09-06 07:25:51         74
# 20: 2021-09-06 07:25:52         74
# 21: 2021-09-06 07:25:53         74
# 22: 2021-09-06 07:25:54         74
# 23: 2021-09-06 07:25:55         76
# 24: 2021-09-06 07:25:56         76
# 25: 2021-09-06 07:25:57         76
# 26: 2021-09-06 07:25:58         76
# 27: 2021-09-06 07:25:59         75
# 28: 2021-09-06 07:26:00         75
# 29: 2021-09-06 07:26:01         75
#               timestamp heart_rate
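For the second part of the question (when and for how many seconds the data is missing), one option is to label consecutive runs of NA with rleid() and summarise each run. This is only a sketch: `per_sec` is a hypothetical table with the same columns as DT2 above, its values are made up, and it assumes the rows are sorted by timestamp with one row per second.

```r
library(data.table)

# hypothetical per-second table in the same shape as DT2 (made-up values)
per_sec <- data.table(
  timestamp  = as.POSIXct("2021-09-06 07:25:33", tz = "UTC") + 0:9,
  heart_rate = c(74, 74, NA, NA, NA, 71, 71, NA, 72, 72)
)

# give every consecutive run of missing / non-missing values its own id
per_sec[, run := rleid(is.na(heart_rate))]

# keep only the missing runs: first second, last second, length in seconds
# (first/last element is used, so the table must be sorted by timestamp)
gaps <- per_sec[is.na(heart_rate),
                .(from = timestamp[1L], to = timestamp[.N], seconds = .N),
                by = run]
gaps
```

Here `gaps` has one row per missing stretch: a 3-second gap starting at 07:25:35 and a 1-second gap at 07:25:40.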
