R - Find the beginning and end of a timestamp range according to the values in another column

Dear all,

I am new to R programming, so I have come here to ask for help. I have been trying hard to solve this problem, but without success.

I have a data.frame similar to this:

df2 <- data.frame(Recordig = c("Rec1", "Rec1", "Rec1", "Rec1", "Rec1", "Rec1", 
                               "Rec2","Rec2","Rec2","Rec2","Rec2","Rec2"), 
                  MediaName = c("Imagem1","Imagem1","Imagem1",
                                "Estimulo1","Estimulo1","Estimulo1",
                                "Imagem1","Imagem1","Imagem1",
                                "Estimulo1","Estimulo1","Estimulo1"),
                  Timestamp = c( 4975 , 5155 , 5312 ,25076, 25463 ,26040 , 5035 , 5248, 5551, 17047 , 17263,  17533))

A simplified version is shown below:

 Recordig MediaName Timestamp
1      Rec1   Imagem1      4975
2      Rec1   Imagem1      5155
3      Rec1   Imagem1      5312
4      Rec1 Estimulo1     25076
5      Rec1 Estimulo1     25463
6      Rec1 Estimulo1     26040
7      Rec2   Imagem1      5035
8      Rec2   Imagem1      5248
9      Rec2   Imagem1      5551
10     Rec2 Estimulo1     17047
11     Rec2 Estimulo1     17263
12     Rec2 Estimulo1     17533

What is my goal? I need to know exactly how much time each participant (e.g. Rec1) spent viewing each image (e.g. Imagem1). In this case, the Timestamp for Imagem1 starts at 4975 ms (4.975 s) and ends at 5312 ms (5.312 s), giving 337 ms.

The difficulty is that I have hundreds of images and thousands of respondents, each of whom spent a different amount of time looking at each image.

Does anyone have an idea that could help me, please?

You can find the difference between the first and last timestamp for each participant (Recordig) and image (MediaName) using the dplyr package:

library(dplyr)
df3 <- df2 %>% 
        dplyr::group_by(Recordig, MediaName) %>%
        dplyr::summarise(duration = diff(range(Timestamp)))

df3
# Source: local data frame [4 x 3]
# Groups: Recordig [?]
# 
#   Recordig MediaName duration
#     <fctr>    <fctr>    <dbl>
# 1     Rec1 Estimulo1      964
# 2     Rec1   Imagem1      337
# 3     Rec2 Estimulo1      486
# 4     Rec2   Imagem1      516
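
Since the question also asks for the beginning and the end of each viewing period, the same grouping can return those alongside the duration. The following is a minimal sketch, not part of the original answer: the column names start_ms, end_ms and duration_ms are made up for illustration, the `.groups` argument assumes dplyr 1.0.0 or later, and the approach assumes each image appears only once (as one contiguous block) per recording.

```r
library(dplyr)

# Sample data from the question (the column name `Recordig` is kept as spelled there)
df2 <- data.frame(
  Recordig  = rep(c("Rec1", "Rec2"), each = 6),
  MediaName = rep(rep(c("Imagem1", "Estimulo1"), each = 3), times = 2),
  Timestamp = c(4975, 5155, 5312, 25076, 25463, 26040,
                5035, 5248, 5551, 17047, 17263, 17533)
)

# One row per recording and image: first timestamp, last timestamp, difference
df_times <- df2 %>%
  group_by(Recordig, MediaName) %>%
  summarise(
    start_ms    = min(Timestamp),       # hypothetical column name
    end_ms      = max(Timestamp),       # hypothetical column name
    duration_ms = end_ms - start_ms,    # same value as diff(range(Timestamp))
    .groups     = "drop"
  )

df_times
```

If an image can be shown more than once within the same recording, min() and max() would span all of its presentations, so the grouping would need an extra key that identifies each presentation.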

We can also do this in base R with aggregate():

aggregate(cbind(duration = Timestamp) ~Recordig + MediaName, df2,
               FUN = function(x) diff(range(x)))
#    Recordig MediaName duration
#1     Rec1 Estimulo1      964
#2     Rec2 Estimulo1      486
#3     Rec1   Imagem1      337
#4     Rec2   Imagem1      516
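
The base R approach can likewise be extended to report the first and last timestamp of each group. This is only a sketch under the same assumption of one contiguous block per image and recording; the names start_ms, end_ms and duration_ms are illustrative and do not come from the original answer. When FUN returns a named vector, aggregate() stores the result as a matrix column, so do.call(data.frame, ...) is used here to flatten it into ordinary columns.

```r
# Sample data from the question (the column name `Recordig` is kept as spelled there)
df2 <- data.frame(
  Recordig  = rep(c("Rec1", "Rec2"), each = 6),
  MediaName = rep(rep(c("Imagem1", "Estimulo1"), each = 3), times = 2),
  Timestamp = c(4975, 5155, 5312, 25076, 25463, 26040,
                5035, 5248, 5551, 17047, 17263, 17533)
)

# FUN returns three named values per group; aggregate() keeps them in a matrix column
res <- aggregate(Timestamp ~ Recordig + MediaName, data = df2,
                 FUN = function(x) c(start_ms    = min(x),
                                     end_ms      = max(x),
                                     duration_ms = max(x) - min(x)))

# Flatten the matrix column into regular columns and drop the "Timestamp." prefix
res <- do.call(data.frame, res)
names(res) <- sub("^Timestamp\\.", "", names(res))

res
```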
