Calculate age standardised rates by sub-group with confidence intervals in R
Aggregate data and calculate confidence intervals for sub cohort in R
I have a data set:
testData <- structure(list(group = c("Group1", "Group1", "Group1", "Group1",
"Group1", "Group1", "Group1", "Group1", "Group1", "Group1", "Group1",
"Group1", "Group1", "Group1", "Group1", "Group1", "Group1", "Group1",
"Group1", "Group1", "Group1", "Group1", "Group1", "Group1", "Group1",
"Group1", "Group1", "Group1", "Group1", "Group1", "Group2", "Group2",
"Group2", "Group2", "Group2", "Group2", "Group2", "Group2", "Group2",
"Group2", "Group2", "Group2", "Group2", "Group2", "Group2", "Group2",
"Group2", "Group2", "Group2", "Group2", "Group2", "Group2"),
year = c(2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,
2015, 2015, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2016, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017,
2017, 2017, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2017, 2017, 2017,
2017, 2017, 2017, 2017), category = c("cat1", "cat1", "cat1",
"cat1", "cat1", "cat2", "cat2", "cat2", "cat2", "cat2", "cat1",
"cat1", "cat1", "cat1", "cat1", "cat2", "cat2", "cat2", "cat2",
"cat2", "cat1", "cat1", "cat1", "cat1", "cat1", "cat2", "cat2",
"cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2",
"cat2", "cat2", "cat3", "cat3", "cat3", "cat3", "cat3", "cat3",
"cat3", "cat3", "cat3", "cat3", "cat3", "cat1", "cat1", "cat1",
"cat1"), value = c(30.1660205462388, 96.1649663179749, 183.691571800985,
1.65328912643215, 9.30044741412784, 182.449748512614, 8.47095574122154,
23.3081277048748, 53.1188233968077, 34.250829201039, 50.5445297997031,
120.307165280983, 140.223343284331, 122.319359028798, 43.0193263100948,
134.417238652291, 106.437343685401, 84.0446901587849, 69.7099679759042,
132.101156129094, 27.8329259333861, 58.4953521410472, 100.379478360197,
77.2357869871934, 200.464054913284, 47.6252352008202, 109.598360734847,
18.1730751285375, 67.5769989539879, 26.7504753716622, 16.8630228114074,
75.2053705357279, 39.7641860921024, 126.658782796637, 64.8507816634371,
96.3471066298501, 61.4392604693245, 27.6801895514785, 181.599217867455,
11.1036117561468, 68.1516849014302, 115.899355317842, 167.032368398535,
116.634854779718, 144.080455202308, 186.627050299051, 72.3807151133032,
37.6345953992576, 2.09517321452513, 58.3682650864716, 54.3590148062561,
53.9884625670805)), row.names = c(NA, -52L), class = c("data.table",
"data.frame"))
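I want to aggregate the data at different levels and compute the confidence interval of value at each aggregation level. For example, I defined two sets of grouping factors that should be used for the aggregation: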
cohort1 = c("group" ,"category", "year")
cohort2 = c("group" ,"category")
I wrote a function to calculate the confidence interval:
calculateCI <- function(value){
  avg <- mean(value)
  s <- sqrt(var(value))
  n <- length(value)
  error <- qnorm(0.975)*s/sqrt(n)
  lower <- avg - error
  upper <- avg + error
  return(list(lowerCI = lower,
              upperCI = upper))
}
How can I aggregate the data and calculate the confidence intervals?
I tried using dplyr:
testData %>%
group_by(cohort) %>%
group_map(~ calculateCI(.x$value))
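But it does not work with the vector cohort. How can I pass a vector as the argument to group_by?
In addition, I would like to get the result as a data.table with one column each for the lower and upper confidence bounds: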
group category year sumValue lowerCi upperCi
1: Group1 cat1 2015 320.9763 xxx yyy
2: Group1 cat2 2015 301.5985 xxx yyy
3: Group1 cat1 2016 476.4137 xxx yyy
4: Group1 cat2 2016 526.7104 xxx yyy
5: Group1 cat1 2017 464.4076 xxx yyy
6: Group1 cat2 2017 269.7241 xxx yyy
7: Group2 cat2 2016 481.1285 xxx yyy
8: Group2 cat3 2016 832.1817 xxx yyy
9: Group2 cat3 2017 296.6424 xxx yyy
10: Group2 cat1 2017 168.8109 xxx yyy
You can compute the mean and SD by group:
tapply(testData$value, INDEX = list(testData$group), FUN = mean, na.rm = TRUE)
tapply(testData$value, INDEX = list(testData$group), FUN = sd, na.rm = TRUE)
Or by more than one factor:
tapply(testData$value, INDEX = list(testData$category, testData$group), FUN = mean, na.rm = TRUE)
And calculate the CI:
library(dplyr)
testData %>%
  group_by(group, category, year) %>%
  summarise(across(.cols = value,
                   .fns = c("n" = ~n(), "Mean" = mean, "StDev" = sd),
                   .names = "{.fn}"),
            .groups = "drop") %>%
  mutate("StE" = StDev / sqrt(n)) %>%
  mutate("LowerCI95" = Mean - (1.96 * StE),
         "UpperCI95" = Mean + (1.96 * StE))
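The pipeline above hard-codes the grouping columns. To group by a character vector such as cohort1 or cohort2, and to get a data.table back, something along the following lines may work. This is only a sketch: the toy data set and the helper name summariseCohort are made up for illustration and are not part of the question, and calculateCI is repeated from the question (with sd() in place of sqrt(var())) so the block runs on its own.

```r
library(data.table)
library(dplyr)

# Toy stand-in for testData (hypothetical values): 2 groups x 2 categories
# x 2 years, with 3 observations per cell.
toy <- CJ(group = c("Group1", "Group2"),
          category = c("cat1", "cat2"),
          year = c(2015, 2016),
          rep = 1:3)
toy[, value := seq_len(.N) * 10]
toy[, rep := NULL]

cohort1 <- c("group", "category", "year")
cohort2 <- c("group", "category")

# Same logic as the function in the question.
calculateCI <- function(value) {
  avg <- mean(value)
  s <- sd(value)
  n <- length(value)
  error <- qnorm(0.975) * s / sqrt(n)
  list(lowerCI = avg - error, upperCI = avg + error)
}

# Hypothetical helper: data.table accepts a character vector of column
# names in `by`, so the cohort definition can be passed in directly.
summariseCohort <- function(dt, cohort) {
  dt[, c(list(sumValue = sum(value)), calculateCI(value)), by = cohort]
}

res1 <- summariseCohort(toy, cohort1)  # one row per group/category/year
res2 <- summariseCohort(toy, cohort2)  # one row per group/category

# dplyr equivalent: across(all_of()) turns the character vector into
# grouping columns; as.data.table() converts the tibble afterwards.
res2_dplyr <- toy %>%
  group_by(across(all_of(cohort2))) %>%
  summarise(sumValue = sum(value),
            lowerCI = calculateCI(value)$lowerCI,
            upperCI = calculateCI(value)$upperCI,
            .groups = "drop") %>%
  as.data.table()
```

With the real data, testData would take the place of toy. Note that by = cohort relies on no column in the table being called cohort.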