I have a dataframe with weekly values for a large number of variables. I want to iterate through each column and obtain the weekly change per row and variable expressed as percent.
Example:
library(magrittr)  # provides %>%
a = c(2,3,1,9)
b = c(4,5,8,1)
sentiment = cbind(a,b) %>%
  as.data.frame()
Outcome should be:
a b a_delta b_delta
2 4      NA      NA
3 5     0.5     0.3
1 8    -0.7     0.6
9 1     8.0    -0.8
In my current approach I use two steps: (1) create a weekly lag, (2) calculate the percentage difference between the lagged value and the value. There is no error message, but the calculation is still incorrect and I am not sure why. Any help would be much appreciated!
library(data.table)
library(magrittr)  # provides %>%
a = c(2,2.5,2,4)
b = c(4,5,8,1)
sentiment = cbind(a,b) %>%
  as.data.frame()
setDT(sentiment)[, paste0(names(sentiment), "_delta") := lapply(.SD, function(x) shift(x, 1L, type="lag")/x - 1)]
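A likely reason the attempt above produces wrong numbers: shift(x, 1L, type="lag")/x - 1 divides the previous week's value by the current one, whereas a percent change divides the current value by the previous one. Below is a minimal sketch of the corrected call, assuming the example data from the top of the question; the names cols and new_cols are introduced here only for illustration, and the rounding to one decimal is there only to match the desired output.

```r
library(data.table)

# Example data from the top of the question
sentiment <- data.table(a = c(2, 3, 1, 9), b = c(4, 5, 8, 1))

# Capture the original column names before adding new columns
cols <- names(sentiment)
new_cols <- paste0(cols, "_delta")  # illustrative names for the output columns

# Percent change = current value / previous value - 1
sentiment[, (new_cols) := lapply(.SD, function(x) {
  round(x / shift(x, 1L, type = "lag") - 1, 1)
}), .SDcols = cols]

print(sentiment)
```

Passing .SDcols explicitly keeps the calculation restricted to the original columns, so running the line a second time does not compute deltas of deltas.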
Here is a base R solution that uses sapply inside a function passed to lapply. lapply iterates over the columns of sentiment, and setNames supplies the desired output column names.
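sentiment <- data.frame(a = c(2,3,1,9), b = c(4,5,8,1))
# Relative change from the previous element, rounded to one decimal;
# the first element has no predecessor, so it is NA
calc_lag <- function(x) {
  c(NA, round(sapply(2:length(x), function(y) {
    (x[y] - x[y-1]) / x[y-1]
  }), 1))
}
# Rename the columns first so the results come back as a_lag, b_lag
cbind(sentiment, lapply(setNames(sentiment, paste0(colnames(sentiment), '_lag')), calc_lag))
#  a b a_lag b_lag
#1 2 4    NA    NA
#2 3 5   0.5   0.2
#3 1 8  -0.7   0.6
#4 9 1   8.0  -0.9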
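The same calculation can also be written without the inner sapply loop, since diff and head are already vectorized. This is only a sketch under the same example data; pct_change is a hypothetical helper name, and the _delta suffix follows the naming asked for in the question.

```r
# Example data from the question
sentiment <- data.frame(a = c(2, 3, 1, 9), b = c(4, 5, 8, 1))

# pct_change is a hypothetical helper: change from the previous element
# divided by the previous element, rounded to one decimal
pct_change <- function(x) {
  c(NA, round(diff(x) / head(x, -1), 1))
}

# Apply the helper to every column and attach the results with a _delta suffix
deltas <- as.data.frame(lapply(sentiment, pct_change))
names(deltas) <- paste0(names(sentiment), "_delta")
result <- cbind(sentiment, deltas)

print(result)
```

Here head(x, -1) drops the last element, so each difference is divided by the value that precedes it.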
We can use diff to get the change and divide it by the lagged value:
library(dplyr)
sentiment %>%
  mutate_all(list(delta = ~ round(c(NA, diff(.))/lag(.), 1)))
Or, with dplyr 1.0.0 or later, we can use across:
sentiment %>%
  mutate(across(everything(), ~ round(c(NA, diff(.x))/lag(.x), 1),
                .names = "{.col}_delta"))
# a b a_delta b_delta
#1 2 4 NA NA
#2 3 5 0.5 0.2
#3 1 8 -0.7 0.6
#4 9 1 8.0 -0.9