
Absolute differences between 2 groups and their 95% confidence intervals in R for each row, added to the corresponding row in a specific column

I am trying to calculate the absolute difference between 2 groups and its 95% confidence interval in R for each row, and add the result to the corresponding row of a column named "Absolute.difference_95CI" in the same data frame. Any advice will be greatly appreciated.

### my data ###
data <-read.table(text="
 Variable   Men Women   Absolute.difference_95CI
n   979488  317716  NA
Family.history.of.smoking   222153  79810   
Prior.MI    500340  166528  
Peripheral.vascular.disease 128795  50008   
Cerebrovascular.disease 173112  76815   
", header=T, sep="\t")

My code (obtained from this link):

data2<-data
for(i in 1:nrow(data2)) {       # for-loop over rows
  m=data2$Men
  w=data2$Women
  a<-prop.test(x=c(me,we), n=c(m,w), correct=FALSE);
  data2$Absolute.difference_95CI <- paste0( round ( (a[["estimate"]][1]- a[["estimate"]][2]), digits=3)," (",  round(a[["conf.int"]][1], digits=3),"-", round(a[["conf.int"]][2],digits=3),")")
  
}

First, convert your data into a longer format, with a separate column holding n for men and for women.

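library(data.table)
setDT(data)  # convert the data.frame to a data.table by reference

# One row per Variable and sex; N is the group size taken from the "n" row,
# which is then dropped. Uses the Men/Women columns listed under "Input:".
data = melt(data, "Variable")[, N:=max(value[Variable=="n"]), variable][Variable!="n"]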

Then, use a helper function f(), as below, to run prop.test() and return the values in a list.

f <- function(x, n, d = 3) {
  # x: counts (men, women); n: group sizes (men, women); d: digits to round to
  pt = prop.test(x,n)
  list("men" = round(pt$estimate[1],d), "women" = round(pt$estimate[2],d),
       "diff" = round(pt$estimate[1] - pt$estimate[2],d), 
       "95% CI" = paste0("(",round(pt$conf.int[1],d), " : ", round(pt$conf.int[2],d),")")
  )
}
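As a quick sanity check of what the helper relies on, prop.test() can be called directly on a single row. This is only a sketch: the counts are copied from the Family.history.of.smoking row and the "n" row of the question's data, and the names x, n and diff_mw are illustrative.

```r
# Counts for one row (Family.history.of.smoking) and group sizes from the "n" row
x <- c(Men = 222153, Women = 79810)
n <- c(Men = 979488, Women = 317716)

# Two-sample test of equal proportions (continuity correction is on by default)
pt <- prop.test(x = x, n = n)

pt$estimate                                          # proportion in each group
diff_mw <- unname(pt$estimate[1] - pt$estimate[2])   # men minus women
diff_mw
pt$conf.int                                          # 95% CI for that difference
```

The conf.int element is the interval for the difference in proportions (first group minus second), so its sign follows the order in which the groups are passed.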

Then, apply the function to your long-formatted data, by Variable:

data[, f(value, N),Variable]

Output:

                      Variable   men women   diff            95% CI
1:   Family.history.of.smoking 0.227 0.251 -0.024 (-0.026 : -0.023)
2:                    Prior.MI 0.511 0.524 -0.013 (-0.015 : -0.011)
3: Peripheral.vascular.disease 0.131 0.157 -0.026 (-0.027 : -0.024)
4:     Cerebrovascular.disease 0.177 0.242 -0.065 (-0.067 : -0.063)
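If the result is wanted as a single text column named "Absolute.difference_95CI" in the original wide data frame, as the question asks, one possible base R sketch is shown here. It is not part of the answer above: the helper fmt_diff_ci() and the " to " separator are assumptions made for illustration, and the data frame is rebuilt so the block is self-contained.

```r
# Wide data, as in the question (the "n" row holds the group sizes)
data <- data.frame(
  Variable = c("n", "Family.history.of.smoking", "Prior.MI",
               "Peripheral.vascular.disease", "Cerebrovascular.disease"),
  Men   = c(979488, 222153, 500340, 128795, 173112),
  Women = c(317716, 79810, 166528, 50008, 76815)
)

n_men   <- data$Men[data$Variable == "n"]
n_women <- data$Women[data$Variable == "n"]

# Hypothetical helper: returns "diff (lower to upper)" for one row's counts
fmt_diff_ci <- function(men, women, digits = 3) {
  pt  <- prop.test(x = c(men, women), n = c(n_men, n_women))
  est <- unname(pt$estimate[1] - pt$estimate[2])        # men minus women
  r   <- function(v) formatC(v, format = "f", digits = digits)
  paste0(r(est), " (", r(pt$conf.int[1]), " to ", r(pt$conf.int[2]), ")")
}

# Fill every row except "n"; the "n" row stays NA
rows <- data$Variable != "n"
data$Absolute.difference_95CI <- NA_character_
data$Absolute.difference_95CI[rows] <-
  mapply(fmt_diff_ci, data$Men[rows], data$Women[rows])

data
```

Using " to " rather than "-" between the bounds avoids confusion with the minus signs when the bounds are negative.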

Input:

data = structure(list(Variable = c("n", "Family.history.of.smoking", 
"Prior.MI", "Peripheral.vascular.disease", "Cerebrovascular.disease"
), Men = c(979488L, 222153L, 500340L, 128795L, 173112L), Women = c(317716L, 
79810L, 166528L, 50008L, 76815L)), row.names = c(NA, -5L), class = "data.frame")
