How to calculate variation over many years for multiple variables in R
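I would like to know whether there is a quick way to calculate how several variables change over time, separately for each level of another variable.
Here is a sample of my data: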
obj time valA valB
1 A a2021 115 1840
2 B a2021 19 265
3 C a2021 158 803
4 D a2021 44 771
5 E a2021 86 1009
6 A a2020 76 1448
7 B a2020 23 360
8 C a2020 157 872
9 D a2020 39 778
10 E a2020 46 1106
11 A a2019 79 1720
12 B a2019 16 402
13 C a2019 99 1019
14 D a2019 28 640
15 E a2019 40 956
16 A a2018 69 1979
17 B a2018 7 360
18 C a2018 88 1051
19 D a2018 19 633
20 E a2018 NA NA
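I would like to find a way to calculate the year-over-year change in valA and valB for each "obj" across the years in "time".
I am looking for something like this: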
obj Inte valA valB
1 A 20-21 0.51 0.27
2 B 20-21 ... ...
3 C 20-21 ... ...
4 D 20-21 ... ...
5 E 20-21 ... ...
6 A 19-20 ...
7 B 19-20 ...
...
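The 0.51 and 0.27 in the first row are consistent with a relative change, (current - previous) / previous, for obj A between a2020 and a2021. A minimal check, assuming that is the intended definition of "variation" (the variable names below are only illustrative):

```r
# Values for obj A copied from the sample data above
valA_2020 <- 76
valA_2021 <- 115
valB_2020 <- 1448
valB_2021 <- 1840

# Relative change from 2020 to 2021
change_A <- (valA_2021 - valA_2020) / valA_2020
change_B <- (valB_2021 - valB_2020) / valB_2020

round(c(valA = change_A, valB = change_B), 2)
#> valA valB
#> 0.51 0.27
```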
Here is the dput() output of the data:
structure(list(obj = c("A", "B", "C", "D", "E", "A", "B", "C",
"D", "E", "A", "B", "C", "D", "E", "A", "B", "C", "D", "E"),
time = c("a2021", "a2021", "a2021", "a2021", "a2021", "a2020",
"a2020", "a2020", "a2020", "a2020", "a2019", "a2019", "a2019",
"a2019", "a2019", "a2018", "a2018", "a2018", "a2018", "a2018"
), valA = c(115, 19, 158, 44, 86, 76, 23, 157, 39, 46, 79,
16, 99, 28, 40, 69, 7, 88, 19, NA), valB = c(1840, 265, 803,
771, 1009, 1448, 360, 872, 778, 1106, 1720, 402, 1019, 640,
956, 1979, 360, 1051, 633, NA)), row.names = c(NA, -20L), class = "data.frame")
Thank you very much.
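Here is a solution that works as long as your data does not go back before 2000. It sorts and labels the years by their last two digits, so a year such as 1999 would sort after 2021.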
library(dplyr)
# df is the data frame built from the structure(...) call in the question
df %>%
  # Keep only the last two digits of the year (e.g. "a2021" -> 21) for the Inte label
  mutate(time = as.numeric(substr(time, 4, 5))) %>%
  arrange(time) %>%
  group_by(obj) %>%
  # Build the Inte label from the previous and the current year using lag()
  mutate(Inte = paste0(lag(time, 1), "-", time)) %>%
  # Relative change from the previous year for every val{x} column
  mutate(across(starts_with("val"),
                ~ (.x - lag(.x, 1)) / lag(.x, 1),
                .names = "{.col}_var")) %>%
  # Drop rows whose change is NA: each object's first year, and years whose previous value is missing
  filter(!is.na(valA_var)) %>%
  ungroup() %>%
  # Drop the original time and val{x} columns
  select(-time, -matches("val.$"))
This is the output:
#> # A tibble: 14 × 4
#> obj Inte valA_var valB_var
#> <chr> <chr> <dbl> <dbl>
#> 1 A 18-19 0.145 -0.131
#> 2 B 18-19 1.29 0.117
#> 3 C 18-19 0.125 -0.0304
#> 4 D 18-19 0.474 0.0111
#> 5 A 19-20 -0.0380 -0.158
#> 6 B 19-20 0.438 -0.104
#> 7 C 19-20 0.586 -0.144
#> 8 D 19-20 0.393 0.216
#> 9 E 19-20 0.15 0.157
#> 10 A 20-21 0.513 0.271
#> 11 B 20-21 -0.174 -0.264
#> 12 C 20-21 0.00637 -0.0791
#> 13 D 20-21 0.128 -0.00900
#> 14 E 20-21 0.870 -0.0877
Created on 2022-07-26 by the reprex package (v2.0.1)
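If the data might include years before 2000, one possible variation is to sort on the full four-digit year and only shorten it when building the label. The following is a sketch rather than part of the original answer. It assumes the time column always has the form "a" followed by a four-digit year, and that each obj has consecutive years with no gaps, since lag() compares each row with the previous row rather than with the previous calendar year. The helper column `year` and the result name `res` are introduced here for illustration, and the data is a two-object subset of the question's data to keep the example short.

```r
library(dplyr)

# Subset of the question's data (objects A and E only)
df <- data.frame(
  obj  = rep(c("A", "E"), times = 4),
  time = rep(c("a2021", "a2020", "a2019", "a2018"), each = 2),
  valA = c(115, 86, 76, 46, 79, 40, 69, NA),
  valB = c(1840, 1009, 1448, 1106, 1720, 956, 1979, NA)
)

res <- df %>%
  # Parse the full four-digit year so that sorting also works across centuries
  mutate(year = as.integer(sub("^a", "", time))) %>%
  arrange(obj, year) %>%
  group_by(obj) %>%
  mutate(
    # Zero-padded two-digit label, e.g. "99-00" for 1999 to 2000
    Inte = sprintf("%02d-%02d", lag(year) %% 100L, year %% 100L),
    # Relative change from the previous row for every val{x} column
    across(starts_with("val"), ~ (.x - lag(.x)) / lag(.x), .names = "{.col}_var")
  ) %>%
  # Drop each object's first year, which has no previous year to compare with
  filter(!is.na(lag(year))) %>%
  ungroup() %>%
  select(obj, Inte, ends_with("_var"))

res
```

Unlike the code above, this sketch filters on the year rather than on valA_var, so the E 18-19 row is kept with NA values instead of being dropped. Replacing the filter with `filter(!is.na(valA_var))` would restore the original behaviour.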