
R - Split numeric vector into intervals

I have a question about "splitting" a vector, although other approaches might also work. I have a data.frame (df) that looks like this (simplified version):

   case time
1   1   5
2   2   3
3   3   4
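
For reproducibility, the simplified data can be rebuilt as follows. This is a small sketch: the column types are an assumption, since the question only shows the printed table, and the copy named df1 is added only because the answers below refer to the data under that name.

```r
# Rebuild the simplified example shown in the question.
# Column types are assumed (integer case, numeric time).
df <- data.frame(case = 1:3, time = c(5, 3, 4))

# The answers refer to the same data as df1.
df1 <- df

df
```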

The "time" variable counts units of time (days, weeks, etc.) until an event occurs. I would like to expand the data set so that each case gets one row per interval, "splitting" the "time" into intervals of length 1, beginning at 2. The result might then look something like this:

    case    time    begin   end
1   1       5       2       3
2   1       5       3       4
3   1       5       4       5
4   2       3       2       3
5   3       4       2       3
6   3       4       3       4

Obviously, my data set is a bit larger than this example. What would be a feasible method to achieve this result?

I had one idea of beginning with

df.exp <- df[rep(row.names(df), df$time - 2), 1:2]

in order to expand the number of rows per case, according to the number of time intervals. Based on this, a "begin" and "end" column might be added in the fashion of:

df.exp$begin <- 2:(df.exp$time-1)

However, I have not managed to create these columns, because the ":" operator only uses the first element of (df.exp$time-1) to build the sequence, so the result is not computed separately for each "case".

Any ideas would be very much appreciated!

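You can try the following (df1 is the data frame from the question):

# Repeat each row once per unit interval between 2 and time
df2 <- df1[rep(seq_len(nrow(df1)), df1$time - 2), ]
row.names(df2) <- NULL
# For each case, pair consecutive values of 2:time as (begin, end)
m1 <- do.call(rbind,
          Map(function(x, y) {
                  v1 <- seq(x, y)
                  cbind(v1[-length(v1)], v1[-1L])
              },
              2, df1$time))
df2[c('begin', 'end')] <- m1
df2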
#  case time begin end
#1    1    5     2   3
#2    1    5     3   4
#3    1    5     4   5
#4    2    3     2   3
#5    3    4     2   3
#6    3    4     3   4

Or an option with data.table:

library(data.table)
# Within each case, build 2:time and pair consecutive values
setDT(df1)[, {tmp <- seq(2, time)
              list(time = time,
                   begin = tmp[-length(tmp)],
                   end = tmp[-1])}, by = case]
#   case time begin end
#1:    1    5     2   3
#2:    1    5     3   4
#3:    1    5     4   5
#4:    2    3     2   3
#5:    3    4     2   3
#6:    3    4     3   4
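
Another data.table variant (the repeated time column is not named here, so it shows up as V1):

library(data.table)
DT <- as.data.table(df)
# Repeat time per case, number the intervals from 2, then add the end point
DT[, rep(time, time-2), case][, begin := 2:(.N+1), case][, end := begin +1][]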
#   case V1 begin end
#1:    1  5     2   3
#2:    1  5     3   4
#3:    1  5     4   5
#4:    2  3     2   3
#5:    3  4     2   3
#6:    3  4     3   4
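
The idea from the question (expand the rows first, then add "begin" and "end") can also be completed in base R. The sketch below is not taken from either answer; it swaps the ":" operator for sequence(), which restarts its count for every element of its argument. It assumes every time value is at least 2, since rep() does not accept negative counts, and the helper name n_int is made up for illustration.

```r
# Example data from the question.
df <- data.frame(case = 1:3, time = c(5, 3, 4))

# Number of unit intervals per case: [2,3], [3,4], ..., [time-1, time].
# n_int is a hypothetical helper name; assumes all time values are >= 2.
n_int <- df$time - 2

# Repeat each row once per interval.
df.exp <- df[rep(seq_len(nrow(df)), n_int), ]
row.names(df.exp) <- NULL

# sequence() counts 1, 2, ... separately for each case, so adding 1
# makes "begin" start at 2 within every case.
df.exp$begin <- sequence(n_int) + 1
df.exp$end   <- df.exp$begin + 1

df.exp
```

Because sequence() is vectorised over its argument, no per-case loop or grouping step is needed, which keeps this close to the original df.exp approach.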
