I am dealing with a large dataset (~1 million observations) that includes time series data. In other words, my dataset includes multiple observations of a unique identifier (id) on a day-by-day basis, where day is, for the sake of providing a simple example, just an integer value. For example, my data might look like this:
id var day
1 49 1
1 51 2
1 53 3
1 50 4
2 45 1
2 46 2
2 45 3
2 44 4
Now, I'd like to calculate the derivative of var between successive days. In other words, I'd like to calculate the change in var between day 1 and day 2, day 2 and day 3, etc. for each id. The resulting dataset would thus look like this:
id var day deriv
1 49 1 NA
1 51 2 2
1 53 3 2
1 50 4 -3
2 45 1 NA
2 46 2 1
2 45 3 -1
2 44 4 -1
I suspect that there is some spectacularly simple solution using something like melt that I don't know about. Any help appreciated!
Try:
> dfrm$deriv <- ave(dfrm$var, dfrm$id, FUN=function(v) c(NA, diff(v)) )
> dfrm
id var day deriv
1 1 49 1 NA
2 1 51 2 2
3 1 53 3 2
4 1 50 4 -3
5 2 45 1 NA
6 2 46 2 1
7 2 45 3 -1
8 2 44 4 -1
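Note that ave() applies the function to each group's values in their existing row order, so this assumes the rows are already sorted by day within each id. If that is not guaranteed, a minimal sketch (the shuffled dfrm below is made-up example data, not the asker's real dataset) would sort first:

```r
# Hypothetical example data, deliberately shuffled so rows are not in day order.
dfrm <- data.frame(
  id  = c(2, 1, 1, 2, 1, 2, 2, 1),
  var = c(44, 53, 49, 45, 50, 46, 45, 51),
  day = c(4, 3, 1, 1, 4, 2, 3, 2)
)

# Sort by id, then by day within id, so diff() compares successive days.
dfrm <- dfrm[order(dfrm$id, dfrm$day), ]

# Within each id, prepend NA (no previous day) to the successive differences.
dfrm$deriv <- ave(dfrm$var, dfrm$id, FUN = function(v) c(NA, diff(v)))
```

After sorting, the deriv column should match the table in the question.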
If d is a matrix holding the data (columns id, var, day, in that order) and the day variable is ordered within each id, try this:
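do.call("c", lapply(unique(d[, 1]), function(x) {
  y <- d[d[, 1] == x, , drop = FALSE]                       # rows for this id
  z <- y[-1, , drop = FALSE] - y[-nrow(y), , drop = FALSE]  # successive row differences
  c(NA, z[, 2] / z[, 3])                                    # delta var / delta day
}))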
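This would give you a vector corresponding to delta_var / delta_day (the change in var per unit change in day), with NA for the first observation of each id.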
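Because the expression above divides the change in var by the change in day, it also copes with gaps between days. A rough data-frame equivalent, assuming columns named id, var and day and rows already sorted by day within id (the data and the helper name lag_diff are made up for illustration), might look like this:

```r
# Hypothetical data with uneven day spacing and one id that has a single row.
d <- data.frame(
  id  = c(1, 1, 1, 2, 2, 3),
  var = c(49, 53, 50, 45, 44, 10),
  day = c(1, 3, 4, 1, 4, 1)
)

# Successive differences with a leading NA; also works for length-1 groups.
lag_diff <- function(v) c(NA, diff(v))

# Change in var divided by change in day, computed separately within each id.
d$rate <- ave(d$var, d$id, FUN = lag_diff) / ave(d$day, d$id, FUN = lag_diff)
```

When every id has one row per consecutive day, the denominator is always 1 and this reduces to the plain difference.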