
Create a dataframe or tibble based on values in different rows in another column

I have a column of event time offsets (in ms) like this, though the real data is much bigger:

> ts_data = tibble( t = c(34, 78, 111, 165, 189))
> ts_data
# A tibble: 5 x 1
      t
  <dbl>
1    34
2    78
3   111
4   165
5   189

I'd like to create a second column where the value in each row is the difference between the current row and the previous one (assuming t = 0 at the start). For the data above, worked out by hand, I want to end up with this:

> add_column(ts_data, t_int = c(34, 44, 33, 54, 24))
# A tibble: 5 x 2
      t t_int
  <dbl> <dbl>
1    34    34
2    78    44
3   111    33
4   165    54
5   189    24

i.e. 44 = 78 - 34; 33 = 111 - 78; and so on.

I could do this with a loop, but I was expecting there to be a neater way using relative indexing. So far my search has not turned anything up.

Any pointers would be appreciated :-)

An easy option is diff(), which returns a vector one element shorter than the original vector (or column). Prepend the first value of 't' so that the result has the same length as the original column:

library(dplyr)
ts_data %>% 
   mutate(t_int = c(first(t), diff(t)))
# A tibble: 5 x 2
#      t t_int
#  <dbl> <dbl>
#1    34    34
#2    78    44
#3   111    33
#4   165    54
#5   189    24
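To see why the first value has to be prepended, here is a minimal base R sketch of what diff() returns for the sample data. The variable names `d` and `t_int` are only illustrative, and taking `t[1]` as the first interval relies on the question's assumption that t = 0 at the start.

```r
# Sample offsets from the question
t <- c(34, 78, 111, 165, 189)

# diff() returns successive differences, so it is one element shorter
d <- diff(t)
d
# [1] 44 33 54 24

# Prepend the first offset (the interval from t = 0) to restore the length
t_int <- c(t[1], d)
t_int
# [1] 34 44 33 54 24
```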

Or subtract the lag of the column from the original column, specifying the default as 0 (otherwise the first lagged value is NA):

ts_data %>%
      mutate(t_int = t - lag(t, default = 0))
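If dplyr is not available, the same idea can be sketched in base R. This is not part of the answer above, only an equivalent under the same assumption that the series starts at t = 0. The names `lagged` and `t_int_lag` are illustrative.

```r
# Sample offsets from the question
t <- c(34, 78, 111, 165, 189)

# Put the assumed starting value 0 in front, then take successive differences
t_int <- diff(c(0, t))
t_int
# [1] 34 44 33 54 24

# The same result via a hand-built lag: shift right by one and fill with 0
lagged <- c(0, head(t, -1))
t_int_lag <- t - lagged
```

Both vectors hold the same values, so either can be assigned to a new column with `ts_data$t_int <- ...`.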
