Calculating derivative (i.e. change in one variable based on change in another variable) of longitudinal dataset using R

I am dealing with a large dataset (~1 million observations) of time series data: each unique identifier (id) is observed repeatedly, once per day. For the sake of a simple example, day is just an integer value. My data might look like this:

id    var    day
1     49     1
1     51     2
1     53     3
1     50     4
2     45     1
2     46     2
2     45     3
2     44     4

Now, I'd like to calculate the derivative of var between successive days, i.e. the change in var between day 1 and day 2, day 2 and day 3, etc., for each id. The resulting dataset would look like this:

id    var    day   deriv
1     49     1     NA
1     51     2     2
1     53     3     2
1     50     4     -3
2     45     1     NA
2     46     2     1
2     45     3     -1
2     44     4     -1

I suspect that there is some spectacularly simple solution using something like melt that I don't know about. Any help appreciated!

Try:

> dfrm$deriv <- ave(dfrm$var, dfrm$id, FUN=function(v) c(NA, diff(v)) )
> dfrm
  id var day deriv
1  1  49   1    NA
2  1  51   2     2
3  1  53   3     2
4  1  50   4    -3
5  2  45   1    NA
6  2  46   2     1
7  2  45   3    -1
8  2  44   4    -1
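The ave() call above relies on the rows already being ordered by day within each id, since diff() simply compares neighbouring rows. A minimal sketch that sorts first, assuming a data frame named dfrm rebuilt from the question's example (the shuffled and sorted names are only for illustration):

```r
# Example data from the question
dfrm <- data.frame(
  id  = c(1, 1, 1, 1, 2, 2, 2, 2),
  var = c(49, 51, 53, 50, 45, 46, 45, 44),
  day = c(1, 2, 3, 4, 1, 2, 3, 4)
)

# Deliberately shuffle the rows to show why sorting matters
shuffled <- dfrm[c(3, 1, 8, 2, 5, 4, 7, 6), ]

# Sort by id, then day, so diff() compares successive days
sorted <- shuffled[order(shuffled$id, shuffled$day), ]
rownames(sorted) <- NULL

# Within each id: NA for the first day, then successive differences
sorted$deriv <- ave(sorted$var, sorted$id, FUN = function(v) c(NA, diff(v)))
sorted
```

If the data are already sorted, the order() step is harmless and the result is the same as the one-liner.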

If d is a numeric matrix with columns id, var and day (in that order), and the rows are ordered by day within each id, try this:

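do.call("c", lapply(unique(d[,1]), function(x) {
  y <- d[d[,1] == x, , drop = FALSE]                         # rows for one id
  z <- y[-1, , drop = FALSE] - y[-nrow(y), , drop = FALSE]   # row-to-row differences
  c(NA, z[,2] / z[,3])                                       # change in var / change in day
}))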

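This gives you a vector with the change in var divided by the change in day for each id (NA for the first day of each id). When the days are consecutive, that is the same as the change in var.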
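For data on the scale mentioned in the question, the same rate of change can also be computed without looping over ids. This is only a sketch in base R, assuming a data frame named dfrm with columns id, var and day; the example values (with gaps between days) are hypothetical:

```r
# Hypothetical data with uneven gaps between days
dfrm <- data.frame(
  id  = c(1, 1, 1, 2, 2, 2),
  var = c(49, 53, 50, 45, 46, 44),
  day = c(1, 3, 4, 1, 2, 4)
)

# Order by id, then day
dfrm <- dfrm[order(dfrm$id, dfrm$day), ]
n <- nrow(dfrm)

# TRUE where a row belongs to the same id as the row before it
same_id <- c(FALSE, dfrm$id[-1] == dfrm$id[-n])

# Change in var divided by change in day, computed over the whole column
rate <- c(NA, diff(dfrm$var) / diff(dfrm$day))

# Differences that straddle two ids are meaningless, so blank them out
rate[!same_id] <- NA
dfrm$deriv <- rate
dfrm
```

Note that repeated days within an id would make the denominator zero, so such rows would need to be handled before this step.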
