I am trying to calculate "year/base year" growth by columns.
I use a matrix M (in the reproducible code below) as my dataset. Every column of the matrix contains the data for one year, i.e., column 1 is year 1, and so on.
My base year needs to restart every 3 years, so that year 4 becomes the next base year.
If I were to calculate a year-over-year growth rate using the diff function ((column2/column1) - 1, then (column3/column2) - 1), I would write:
estimate_r <- function(M, period = NULL, bycols = TRUE,
                       rm_cols = TRUE, only_dif = FALSE) {
  # note: rm_cols is currently unused
  # default: a single period spanning the whole dimension
  if (is.null(period)) period <- max(dim(M)[2] * bycols, dim(M)[1] * !bycols)
  if (bycols && dim(M)[2] %% period != 0) {
    stop("Matrix columns are not divisible by ", period)
  }
  #this part needs to be modified-----------
  # by columns
  if (bycols) {
    d_M <- t(diff(t(M)))  # column-to-column differences (l - 1 columns)
    l <- dim(M)[2]
    # differences that cross a period boundary
    pop <- which(seq_len(l - 1) %% period == 0)
    difs <- d_M
    if (only_dif) {
      difs[, pop] <- NA
      return(cbind(NA, difs))
    }
    lag_M <- M[, -l]
    r <- difs / lag_M
    colnames(r) <- colnames(difs)
    r[, pop] <- NA
    r <- cbind(NA, r)
    return(r)
  }
  #-----------
  # by rows
  d_M <- diff(M)  # row-to-row differences (l - 1 rows)
  l <- dim(M)[1]
  # NAs instead of removing rows
  pop <- which(seq_len(l - 1) %% period == 0)
  difs <- d_M
  if (only_dif) {
    difs[pop, ] <- NA
    return(rbind(NA, difs))
  }
  lag_M <- M[-l, ]
  r <- difs / lag_M
  colnames(r) <- colnames(difs)
  r[pop, ] <- NA
  r <- rbind(NA, r)
  return(r)
}
# ---- reproducible matrices ----
M <- matrix(1:27, ncol = 9)
M
estimate_r(M, period = 3, bycols = TRUE, rm_cols = FALSE, only_dif = FALSE)
This gives me the matrix M (each column represents a year):
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]    1    4    7   10   13   16   19   22   25
[2,]    2    5    8   11   14   17   20   23   26
[3,]    3    6    9   12   15   18   21   24   27
and a second matrix with the growth over the previous year (M_t / M_{t-1} - 1), by columns, restarting at years 4 and 7:
     [,1] [,2] [,3] [,4]      [,5]      [,6] [,7]      [,8]      [,9]
[1,]   NA  3.0 0.75   NA 0.3000000 0.2307692   NA 0.1578947 0.1363636
[2,]   NA  1.5 0.60   NA 0.2727273 0.2142857   NA 0.1500000 0.1304348
[3,]   NA  1.0 0.50   NA 0.2500000 0.2000000   NA 0.1428571 0.1250000
But I am unsure whether it is possible to use a variable lag with "diff", choosing 1 or 2 years depending on how far I am from the base year.
I need to compute "year 2 over year 1" (column 2 over column 1) growth, and then "year 3 over year 1". Then the calculation should restart in year 4 and do the same: "year 5 over year 4", etc.
My base years (columns) will therefore be 1, 4 and 7.
Any help is much appreciated.
How about using data.table::shift(), which takes a variable n value for the extent of the lag? Wrap the code in a function like this:
library(data.table)
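get_n_prev_yr_growth <- function(M, yr = 1) {
  # one row per year; g labels each block of 3 years that shares a base year
  d <- as.data.table(t(M))[, g := rep(1:(ncol(M) / 3), each = 3)]
  # within each block, divide by the value yr rows earlier; shift() pads with NA
  t(d[, lapply(.SD, \(i) i / shift(i, yr) - 1), g, .SDcols = patterns("^V")][, !c("g")])
}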
Then just call the function, providing the input matrix and the lag value (default is 1):
get_n_prev_yr_growth(M=M,yr=1)
   [,1] [,2] [,3] [,4]      [,5]      [,6] [,7]      [,8]      [,9]
V1   NA  3.0 0.75   NA 0.3000000 0.2307692   NA 0.1578947 0.1363636
V2   NA  1.5 0.60   NA 0.2727273 0.2142857   NA 0.1500000 0.1304348
V3   NA  1.0 0.50   NA 0.2500000 0.2000000   NA 0.1428571 0.1250000
Or, for a lag of two years:
get_n_prev_yr_growth(M=M,yr=2)
   [,1] [,2] [,3] [,4] [,5]      [,6] [,7] [,8]      [,9]
V1   NA   NA    6   NA   NA 0.6000000   NA   NA 0.3157895
V2   NA   NA    3   NA   NA 0.5454545   NA   NA 0.3000000
V3   NA   NA    2   NA   NA 0.5000000   NA   NA 0.2857143
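
The question asks for "year 2 over year 1" and "year 3 over year 1" in the same matrix, whereas each call above uses a single lag. One possible way to get both at once, sketched here in base R, is to divide every column by the base column of its block. The helper name growth_vs_base and its period argument are made up for this illustration and are not part of the original post; the sketch assumes the number of columns is a multiple of period.

```r
# Hypothetical helper (not from the original post): growth of each column
# relative to the base column of its block of `period` columns.
growth_vs_base <- function(M, period = 3) {
  stopifnot(ncol(M) %% period == 0)
  cols <- seq_len(ncol(M))
  # index of the base column for every column, e.g. 1 1 1 4 4 4 7 7 7
  base <- ((cols - 1) %/% period) * period + 1
  # divide each column by its base column, element by element
  r <- M / M[, base, drop = FALSE] - 1
  # a base year has no growth rate relative to itself
  r[, cols == base] <- NA
  r
}

M <- matrix(1:27, ncol = 9)
res <- growth_vs_base(M, period = 3)
res
```

With the example matrix, columns 2, 5 and 8 should match the yr=1 output above, and columns 3, 6 and 9 should match the yr=2 output, with NA in the base columns 1, 4 and 7.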