Percentage change for all columns in a row except one
Hi folks,
I have the following dataframe:
obj <- data.frame(degree2 = c(1, 1, 2, 2, 3, 3, 4, 4),
                  yr      = c(1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997),
                  degree  = c(1, 1, 1, 2, 1, 1, 0, 0),
                  degree3 = c(1, 1, 6, 7, 5, 1, 0, 0))
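What I want to do is create the year-over-year percentage change for the variables degree, degree2 and degree3. Note that my real dataframe is considerably longer.
I guess the code has to be something like: "for each row, calculate the percentage change for all variables except yr".
Thanks in advance!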
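We can arrange by 'yr' (if the data is not already ordered) and then use mutate with across, creating new columns by modifying the .names argument of across. Note that the formula below divides each difference by the current value, not by the previous one.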
library(dplyr)
obj <- obj %>%
arrange(yr) %>%
mutate(across(starts_with('degree'),
~ 100 *c(0, diff(.))/., .names = '{.col}_perc_change'))
Or, if we need the difference between the next value and the current one, use lead and -:
obj <- obj %>%
arrange(yr) %>%
mutate(across(starts_with('degree'),
~ 100 * (lead(.) - .)/., .names = '{.col}_perc_change'))
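If the change should instead be measured relative to the previous year's value, a minimal sketch along the same lines could look like the following. It assumes the obj data frame from the question; the result name res is only illustrative.

```r
library(dplyr)

# Data frame from the question
obj <- data.frame(degree2 = c(1, 1, 2, 2, 3, 3, 4, 4),
                  yr      = c(1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997),
                  degree  = c(1, 1, 1, 2, 1, 1, 0, 0),
                  degree3 = c(1, 1, 6, 7, 5, 1, 0, 0))

# res is a hypothetical name for the result
res <- obj %>%
  arrange(yr) %>%                           # make sure rows are in year order
  mutate(across(starts_with("degree"),
                # change from the previous row, relative to the previous row
                ~ 100 * (.x - lag(.x)) / lag(.x),
                .names = "{.col}_perc_change"))
```

The first row of each new column is NA because there is no previous year to compare against.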
Updated with the correct function to calculate the percentage change.
Thanks to ThomasIsCoding and akrun.
The function computes x divided by the lag of x, minus 1, multiplied by 100:
library(dplyr)
# percentage change relative to the previous row
pct_change <- function(x) {(x/lag(x) - 1) * 100}
obj %>%
mutate(across(c(degree2, degree, degree3), pct_change, .names = "pct_change_{.col}"))
Output:
degree2 yr degree degree3 pct_change_degree2 pct_change_degree pct_change_degree3
1 1 1990 1 1 NA NA NA
2 1 1991 1 1 0.0 0 0.0
3 2 1992 1 6 100.0 0 500.0
4 2 1993 2 7 0.0 100 16.7
5 3 1994 1 5 50.0 -50 -28.6
6 3 1995 1 1 0.0 0 -80.0
7 4 1996 0 0 33.3 -100 -100.0
8 4 1997 0 0 0.0 NaN NaN
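The NaN values in the last row come from dividing 0 by 0. If such cases should be reported as missing instead, one possible sketch in base R is shown below. The helper name pct_change_safe is hypothetical, and treating every non-finite result as NA is an assumption about what is wanted, not part of the original answer.

```r
# Hypothetical helper: percentage change relative to the previous element,
# with NaN and +/-Inf (caused by a zero denominator) replaced by NA.
pct_change_safe <- function(x) {
  prev <- c(NA, head(x, -1))       # previous value; NA for the first element
  out <- (x / prev - 1) * 100
  out[!is.finite(out)] <- NA       # 0/0 gives NaN, n/0 gives Inf
  out
}

example <- pct_change_safe(c(1, 1, 0, 0, 2))
```

The function can be passed to across in place of pct_change.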
First answer (incorrect, it only computes the ratio x/lag(x) and overwrites the original columns):
# function to calculate percentage change
pct_change <- function(x) {x/lag(x)}
obj %>%
mutate(across(c("degree2", "degree", "degree3"), pct_change))
output:
degree2 yr degree degree3 degree2_perc_change degree_perc_change degree3_perc_change
1 NA 1990 NA NA 0.0 0 0.0
2 1.00 1991 1.0 1.000 0.0 0 0.0
3 2.00 1992 1.0 6.000 50.0 0 83.3
4 1.00 1993 2.0 1.167 0.0 50 14.3
5 1.50 1994 0.5 0.714 33.3 -100 -40.0
6 1.00 1995 1.0 0.200 0.0 0 -400.0
7 1.33 1996 0.0 0.000 25.0 -Inf -Inf
8 1.00 1997 NaN NaN 0.0 NaN NaN
Here is a base R option:
> (obj[-1, -2] / obj[-nrow(obj), -2] - 1) * 100
degree2 degree degree3
2 0.00000 0 0.00000
3 100.00000 0 500.00000
4 0.00000 100 16.66667
5 50.00000 -50 -28.57143
6 0.00000 0 -80.00000
7 33.33333 -100 -100.00000
8 0.00000 NaN NaN
Or we can add the result to obj as new columns, like below:
perc <- (obj[-1, -2] / obj[-nrow(obj), -2] - 1) * 100
perc <- setNames(perc, paste0(names(perc), "_perc_change"))
obj[rownames(perc), names(perc)] <- perc
This gives
> obj
degree2 yr degree degree3 degree2_perc_change degree_perc_change
1 1 1990 1 1 NA NA
2 1 1991 1 1 0.00000 0
3 2 1992 1 6 100.00000 0
4 2 1993 2 7 0.00000 100
5 3 1994 1 5 50.00000 -50
6 3 1995 1 1 0.00000 0
7 4 1996 0 0 33.33333 -100
8 4 1997 0 0 0.00000 NaN
degree3_perc_change
1 NA
2 0.00000
3 500.00000
4 16.66667
5 -28.57143
6 -80.00000
7 -100.00000
8 NaN
Using lapply in base R (note that this divides each difference by the current value rather than the previous one):
cols <- setdiff(names(obj), 'yr')
obj[paste0('perc_', cols)] <- lapply(obj[cols],function(x) c(0, diff(x))/x * 100)
obj
# degree2 yr degree degree3 perc_degree2 perc_degree perc_degree3
#1 1 1990 1 1 0.00000 0 0.00000
#2 1 1991 1 1 0.00000 0 0.00000
#3 2 1992 1 6 50.00000 0 83.33333
#4 2 1993 2 7 0.00000 50 14.28571
#5 3 1994 1 5 33.33333 -100 -40.00000
#6 3 1995 1 1 0.00000 0 -400.00000
#7 4 1996 0 0 25.00000 -Inf -Inf
#8 4 1997 0 0 0.00000 NaN NaN
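For comparison, a sketch of the same lapply approach with the previous value as the denominator might look like this. It assumes the obj data frame from the question and sorts by yr first; the name prev_pct is only illustrative.

```r
# Data frame from the question
obj <- data.frame(degree2 = c(1, 1, 2, 2, 3, 3, 4, 4),
                  yr      = c(1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997),
                  degree  = c(1, 1, 1, 2, 1, 1, 0, 0),
                  degree3 = c(1, 1, 6, 7, 5, 1, 0, 0))

obj <- obj[order(obj$yr), ]                 # ensure rows are in year order
cols <- setdiff(names(obj), "yr")           # every column except yr

# prev_pct is a hypothetical helper: change relative to the previous element
prev_pct <- function(x) (x / c(NA, head(x, -1)) - 1) * 100

obj[paste0("perc_", cols)] <- lapply(obj[cols], prev_pct)
```

The first row of each perc_ column is NA, since the first year has no predecessor.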