How to calculate variation over many years for multiple variables in R

I would like to know whether there is a quick way to calculate how several variables change over a period of time.

Here is a sample of my data:

   obj  time valA valB
1    A a2021  115 1840
2    B a2021   19  265
3    C a2021  158  803
4    D a2021   44  771
5    E a2021   86 1009
6    A a2020   76 1448
7    B a2020   23  360
8    C a2020  157  872
9    D a2020   39  778
10   E a2020   46 1106
11   A a2019   79 1720
12   B a2019   16  402
13   C a2019   99 1019
14   D a2019   28  640
15   E a2019   40  956
16   A a2018   69 1979
17   B a2018    7  360
18   C a2018   88 1051
19   D a2018   19  633
20   E a2018   NA   NA

I would like to find a way to calculate, for each "obj", the year-over-year variation of valA and valB along "time".

I am looking for something like this:

   obj  Inte valA valB
1    A 20-21 0.51 0.27
2    B 20-21  ...  ...
3    C 20-21  ...  ...
4    D 20-21  ...  ...
5    E 20-21  ...  ...
6    A 19-20  ...  ...
7    B 19-20  ...  ...
...

Here is the data as R input:

df <- structure(list(obj = c("A", "B", "C", "D", "E", "A", "B", "C", 
"D", "E", "A", "B", "C", "D", "E", "A", "B", "C", "D", "E"), 
time = c("a2021", "a2021", "a2021", "a2021", "a2021", "a2020", 
"a2020", "a2020", "a2020", "a2020", "a2019", "a2019", "a2019", 
"a2019", "a2019", "a2018", "a2018", "a2018", "a2018", "a2018"
), valA = c(115, 19, 158, 44, 86, 76, 23, 157, 39, 46, 79, 
16, 99, 28, 40, 69, 7, 88, 19, NA), valB = c(1840, 265, 803, 
771, 1009, 1448, 360, 872, 778, 1106, 1720, 402, 1019, 640, 
956, 1979, 360, 1051, 633, NA)), row.names = c(NA, -20L), class = "data.frame")

Thank you very much.

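Here is a solution that works as long as your data does not go back before 2000. It keeps only the last two digits of the year, so 1999 (99) would sort after 2000 (00).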

library(dplyr)

df %>%
  # Keep only the last two digits of the year ("a2021" -> 21) for the Inte label
  mutate(time = as.numeric(substr(time, 4, 5))) %>%
  # Sort chronologically so that lag() refers to the previous year
  arrange(time) %>%
  group_by(obj) %>%
  # Build the interval label, e.g. "20-21", from the previous and current year
  mutate(Inte = paste0(lag(time, 1), "-", time)) %>%
  # Compute the relative change from the previous year for every val* column
  mutate(across(starts_with("val"),
                ~ (.x - lag(.x, 1)) / lag(.x, 1),
                .names = "{.col}_var")) %>%
  # Drop rows where the change is NA (no previous year, or a missing value)
  filter(!is.na(valA_var)) %>%
  ungroup() %>%
  # Drop the original time and val* columns, keeping only the *_var results
  select(-time, -matches("val.$"))

Here is the output:

#> # A tibble: 14 × 4
#>    obj   Inte  valA_var valB_var
#>    <chr> <chr>    <dbl>    <dbl>
#>  1 A     18-19  0.145   -0.131  
#>  2 B     18-19  1.29     0.117  
#>  3 C     18-19  0.125   -0.0304 
#>  4 D     18-19  0.474    0.0111 
#>  5 A     19-20 -0.0380  -0.158  
#>  6 B     19-20  0.438   -0.104  
#>  7 C     19-20  0.586   -0.144  
#>  8 D     19-20  0.393    0.216  
#>  9 E     19-20  0.15     0.157  
#> 10 A     20-21  0.513    0.271  
#> 11 B     20-21 -0.174   -0.264  
#> 12 C     20-21  0.00637 -0.0791 
#> 13 D     20-21  0.128   -0.00900
#> 14 E     20-21  0.870   -0.0877

Created on 2022-07-26 by the reprex package (v2.0.1)
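
If the data might span the year 2000, one possible variation is to sort on the full four-digit year and use the two-digit form only for the label. The following is a minimal sketch, not part of the original answer: it assumes every `time` value has the form "a<year>" and that each `obj` has one row per consecutive year. The names `year` and `res` are introduced here for illustration.

```r
library(dplyr)

# Same sample data as in the question, rebuilt so the block is self-contained.
df <- data.frame(
  obj  = rep(c("A", "B", "C", "D", "E"), times = 4),
  time = rep(c("a2021", "a2020", "a2019", "a2018"), each = 5),
  valA = c(115, 19, 158, 44, 86, 76, 23, 157, 39, 46,
           79, 16, 99, 28, 40, 69, 7, 88, 19, NA),
  valB = c(1840, 265, 803, 771, 1009, 1448, 360, 872, 778, 1106,
           1720, 402, 1019, 640, 956, 1979, 360, 1051, 633, NA)
)

res <- df %>%
  # Parse the full year, e.g. "a2021" -> 2021 (assumes the "a<year>" format).
  mutate(year = as.integer(sub("^a", "", time))) %>%
  # Sort on the four-digit year so 1999 correctly precedes 2000.
  arrange(obj, year) %>%
  group_by(obj) %>%
  mutate(
    # Two-digit label such as "20-21"; only the label is truncated.
    Inte = sprintf("%02d-%02d", lag(year) %% 100L, year %% 100L),
    # Relative change from the previous row of the same obj.
    across(c(valA, valB),
           ~ (.x - lag(.x)) / lag(.x),
           .names = "{.col}_var")
  ) %>%
  ungroup() %>%
  # Drop rows with no previous year or with a missing value.
  filter(!is.na(valA_var)) %>%
  select(obj, Inte, valA_var, valB_var)
```

Note that the rows come out ordered by `obj` and then by year rather than by year alone, and that `lag()` compares adjacent rows, so a missing year for some `obj` would silently produce a change across the gap.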
