How can I label rows of one dataframe according to a range specified in 2 columns (start and end) of another dataframe?

Question

Apologies if this has been asked before - I tried to search but I might not know the right terms to search for. I have data in the following format:

In one data frame (utterances), I have the start and end frames of the utterances in my data set:

id <- c(1,1,1,2,2,2,2)
utterance_number <- c(1,2,3,1,2,3,4)
start_frame <- c(20,35,67,10,44,56,72)
end_frame <- c(29,44,72,15,52,69,82)

utterances <- cbind(id, utterance_number, start_frame, end_frame)
utterances

In another data frame (values), I have all of the frames:

id <- c(rep(1,80), rep(2,90))
frame <- c(1:80, 1:90)
val1 <- sample(170)
val2 <- sample(170)

values <- cbind(id, frame, val1, val2)
values

I want to label each frame in values with its utterance_number, or with NA if it is not part of an utterance. So in a new column "Utterance_number" in values, the first 19 frames would be NA, frames 20-29 would be labelled "1" and so on.

What is the best way of doing this?

Answer 1

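You can expand utterances into one row per frame using apply, then merge the result with values. With all.x=TRUE, frames that do not fall inside any utterance are kept and get NA.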

merge(values, do.call(rbind, apply(utterances, 1
  , function(x) cbind(id=x[1], frame=x[3]:x[4], utterance_number=x[2])))
 , all.x=TRUE)
#    id frame val1 val2 utterance_number
#1    1     1  166  138               NA
#2    1     2   54  109               NA
#3    1     3   71  103               NA
#4    1     4    9   48               NA
#...
#17   1    17   32   22               NA
#18   1    18  170  100               NA
#19   1    19   57  112               NA
#20   1    20   45  110                1
#21   1    21   25  148                1
#22   1    22   13   25                1
#...
#28   1    28   56   62                1
#29   1    29  130   47                1
#30   1    30  163   15               NA
#31   1    31  110   64               NA
#...
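
If the nested do.call/apply call is hard to read, the same labelling can be sketched with an explicit loop over the utterances. This is only an illustration, not part of the answer above. It assumes the two objects are built with data.frame rather than cbind (cbind gives matrices), and that utterance ranges within one id do not overlap; if they did, the later utterance would overwrite the earlier one. The set.seed call is only there to make val1 and val2 reproducible.

```r
# Rebuild the example data as data frames (an assumption; the question used cbind)
utterances <- data.frame(
  id               = c(1, 1, 1, 2, 2, 2, 2),
  utterance_number = c(1, 2, 3, 1, 2, 3, 4),
  start_frame      = c(20, 35, 67, 10, 44, 56, 72),
  end_frame        = c(29, 44, 72, 15, 52, 69, 82)
)

set.seed(1)  # only so the random columns are reproducible
values <- data.frame(
  id    = c(rep(1, 80), rep(2, 90)),
  frame = c(1:80, 1:90),
  val1  = sample(170),
  val2  = sample(170)
)

# Start with NA everywhere, then fill in each utterance's frame range
values$utterance_number <- NA_real_
for (i in seq_len(nrow(utterances))) {
  u <- utterances[i, ]
  in_range <- values$id == u$id &
    values$frame >= u$start_frame &
    values$frame <= u$end_frame
  values$utterance_number[in_range] <- u$utterance_number
}

# Frames 18 to 31 of id 1: NA up to frame 19, 1 for frames 20-29, NA afterwards
values[values$id == 1 & values$frame %in% 18:31, c("id", "frame", "utterance_number")]
```

The loop keeps the row order of values unchanged, whereas merge sorts the result by the key columns. For a handful of utterances either approach should be fine.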
