
Calculating percentage change over row values while iterating through columns using lapply in R

I have a data frame with weekly values for a large number of variables. I want to iterate through each column and obtain, for each row and variable, the week-over-week change expressed as a percentage change (shown below as a fraction, so 0.5 means +50%).

Example:

a = c(2,3,1,9)
b = c(4,5,8,1)
sentiment = as.data.frame(cbind(a,b))



Outcome should be (rounded to one decimal):
     a  b  a_delta  b_delta
     2  4     NA      NA
     3  5     0.5     0.2
     1  8    -0.7     0.6
     9  1     8.0    -0.9

In my current approach I use two steps: (1) create a one-week lag, and (2) calculate the percentage difference between the lagged value and the current value. There is no error message, but the calculation is still incorrect and I am not sure why. Any help would be much appreciated!

library(data.table)

a = c(2,2.5,2,4)
b = c(4,5,8,1)
sentiment = as.data.frame(cbind(a,b))

# Attempt: lagged value divided by the current value, minus 1
setDT(sentiment)[, paste0(names(sentiment), "_delta") := lapply(.SD, function(x) shift(x, 1L, type="lag")/x - 1)]

Here is a base R solution. A helper function uses sapply to compute the change relative to the previous value, and lapply applies it to every column of sentiment. setNames supplies the desired output column names.

sentiment <- data.frame(a = c(2,3,1,9), b = c(4,5,8,1))
calc_lag <- function(x) {
  c(NA, round(sapply(2:length(x), function(y) {
    (x[y] - x[y-1]) / x[y-1]
  }), 1))
}
cbind(sentiment, lapply(setNames(sentiment, paste0(colnames(sentiment), '_lag')), calc_lag))
#  a b a_lag b_lag
#1 2 4    NA    NA
#2 3 5   0.5   0.2
#3 1 8  -0.7   0.6
#4 9 1   8.0  -0.9
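
The same calculation can also be written without the inner sapply loop. The following is only a sketch of one possible vectorized variant. The helper name pct_change and the _delta suffix are illustrative choices, not part of the answer above, and rounding is left out so the raw values stay visible.

```r
# Hypothetical helper: change relative to the previous element.
# diff(x) gives x[i] - x[i-1]; head(x, -1) gives the previous values x[i-1].
pct_change <- function(x) {
  c(NA, diff(x) / head(x, -1))
}

sentiment <- data.frame(a = c(2, 3, 1, 9), b = c(4, 5, 8, 1))

# Apply the helper to every column and name the results <column>_delta
deltas <- as.data.frame(lapply(sentiment, pct_change))
names(deltas) <- paste0(names(sentiment), "_delta")

result <- cbind(sentiment, deltas)
print(result)
```

Wrapping the result in round(..., 1) should reproduce the one-decimal display used above. Unlike 2:length(x), this form also copes with a column of length one, where it simply returns NA.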

We can use diff:

library(dplyr)
sentiment %>%
      mutate_all(list(delta = ~ round(c(NA, diff(.))/lag(.), 1)))

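Or with across(), which was only in the devel version of dplyr when this was written and has been available since dplyr 1.0.0 (note that the released argument is .names and the placeholder is {.col}):

sentiment %>% 
    mutate(across(everything(),  ~ round(c(NA, diff(.x))/lag(.x), 1), 
           .names = "{.col}_delta"))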
#  a b a_delta b_delta
#1 2 4      NA      NA
#2 3 5     0.5     0.2
#3 1 8    -0.7     0.6
#4 9 1     8.0    -0.9
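
As for why the data.table attempt in the question gives wrong numbers: it divides the lagged value by the current value (previous / current - 1), whereas a percentage change relative to the previous week is current / previous - 1. A minimal sketch of that attempt with the ratio inverted might look like this. The cols variable is an illustrative addition that pins down the source columns, and the data are taken from the question's first example.

```r
library(data.table)

sentiment <- data.table(a = c(2, 3, 1, 9), b = c(4, 5, 8, 1))

# Capture the original column names before any new columns are added
cols <- names(sentiment)

# Percentage change: current value divided by the previous value, minus 1
sentiment[, paste0(cols, "_delta") := lapply(.SD, function(x) {
  x / shift(x, 1L, type = "lag") - 1
}), .SDcols = cols]

print(sentiment)
```

The first row of each _delta column is NA because there is no earlier week to compare against. Values are left unrounded here, so round() can be applied if a one-decimal display is wanted.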
