
SummaryBy function in R

> Score <- c(9.6 ,7.8,6.9,9.6,NA,NA,9.3,9.3,11.1,6.7,5.9,10.4,12.2,6.5,10.1,8.5,7.0,11.2,0.6,8.0)
> CNTRL <- c(rep(12,4), rep(14,4), rep(16,4), rep(18,4), rep(20,2), rep(22,2))
> SERV  <- c(rep(10, 5), rep(15,2), rep(20,13))
> LOS <- c(rep(1,5), rep(0,15))
> RESP <- c(rep(0,10), rep(1,10))
> DataAnalysis <- data.frame(Score, CNTRL, SERV, LOS, RESP)
> DataAnalysis$CNTRL <- as.factor(DataAnalysis$CNTRL)
> DataAnalysis$SERV <- as.factor(DataAnalysis$SERV)
> str(DataAnalysis)
'data.frame':   20 obs. of  5 variables:
 $ Score: num  9.6 7.8 6.9 9.6 NA NA 9.3 9.3 11.1 6.7 ...
 $ CNTRL: Factor w/ 6 levels "12","14","16",..: 1 1 1 1 2 2 2 2 3 3 ...
 $ SERV : Factor w/ 3 levels "10","15","20": 1 1 1 1 1 2 2 3 3 3 ...
 $ LOS  : num  1 1 1 1 1 0 0 0 0 0 ...
 $ RESP : num  0 0 0 0 0 0 0 0 0 0 ...
> library(doBy)
> summaryBy(DataAnalysis$Score~DataAnalysis$CNTRL,data=DataAnalysis,FUN=c(mean, sd),na.rm=TRUE,
+           keep.names=TRUE)
  CNTRL DataAnalysis$Score.mean DataAnalysis$Score.sd
1    12                   8.475              1.350000
2    14                   9.300              0.000000
3    16                   8.525              2.605603
4    18                   9.325              2.417126
5    20                   9.100              2.969848
6    22                   4.300              5.232590
> summaryBy(DataAnalysis$Score~DataAnalysis$SERV,data=DataAnalysis,FUN=c(mean, sd),na.rm=TRUE,
+           keep.names=TRUE)
  SERV DataAnalysis$Score.mean DataAnalysis$Score.sd
1   10                8.475000              1.350000
2   15                9.300000                    NA
3   20                8.269231              3.065503
> summaryBy(DataAnalysis$Score~DataAnalysis$LOS,data=DataAnalysis,FUN=c(mean, sd),na.rm=TRUE,
+           keep.names=TRUE)
  LOS DataAnalysis$Score.mean DataAnalysis$Score.sd
1   0                8.342857              2.958096
2   1                8.475000              1.350000
> summaryBy(DataAnalysis$Score~DataAnalysis$RESP,data=DataAnalysis,FUN=c(mean, sd),na.rm=TRUE,
+           keep.names=TRUE)
  RESP DataAnalysis$Score.mean DataAnalysis$Score.sd
1    0                  8.7875              1.516045
2    1                  8.0400              3.345046

Is there a way to use a loop or an apply function to run this for CNTRL, SERV, LOS, RESP, and so on, all at once? I want to create a table that includes the mean, sd, and p-value, with Score (continuous) as the dependent variable and the remaining columns as independent variables (categorical and continuous). Any help is appreciated.

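Using the setup provided in the question, lapply over the variable names and build each formula from the bare column name (Score ~ CNTRL rather than DataAnalysis$Score ~ DataAnalysis$CNTRL), so the output columns are named Score.mean and Score.sd: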

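# Build "Score ~ <name>" for one predictor and summarise Score by its levels.
fn <- function(nm) summaryBy(formula(paste("Score ~", nm)), DataAnalysis,
          FUN = c(mean, sd), na.rm = TRUE, keep.names = TRUE)
# Apply to every column except Score (the first column).
lapply(names(DataAnalysis)[-1], fn)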

giving:

[[1]]
  CNTRL Score.mean Score.sd
1    12      8.475   1.3500
2    14      9.300   0.0000
3    16      8.525   2.6056
4    18      9.325   2.4171
5    20      9.100   2.9698
6    22      4.300   5.2326

[[2]]
  SERV Score.mean Score.sd
1   10     8.4750   1.3500
2   15     9.3000       NA
3   20     8.2692   3.0655

[[3]]
  LOS Score.mean Score.sd
1   0     8.3429   2.9581
2   1     8.4750   1.3500

[[4]]
  RESP Score.mean Score.sd
1    0     8.7875    1.516
2    1     8.0400    3.345
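
The question also asks for a p-value, which summaryBy does not report. One possible way to get it, sketched here in base R only, is to fit a one-way ANOVA of Score on each predictor and attach the overall F-test p-value to that predictor's rows. The helper name summarise_one and the choice of test are assumptions made for illustration, not part of the original answer. Every predictor is treated as a grouping factor to match the tables above, so a genuinely continuous predictor would need a different model (for example a regression slope test).

```r
# Setup copied from the question.
Score <- c(9.6, 7.8, 6.9, 9.6, NA, NA, 9.3, 9.3, 11.1, 6.7,
           5.9, 10.4, 12.2, 6.5, 10.1, 8.5, 7.0, 11.2, 0.6, 8.0)
DataAnalysis <- data.frame(
  Score = Score,
  CNTRL = factor(c(rep(12, 4), rep(14, 4), rep(16, 4),
                   rep(18, 4), rep(20, 2), rep(22, 2))),
  SERV  = factor(c(rep(10, 5), rep(15, 2), rep(20, 13))),
  LOS   = c(rep(1, 5), rep(0, 15)),
  RESP  = c(rep(0, 10), rep(1, 10))
)

# Hypothetical helper (not from the original answer): per-level mean and sd
# of Score, plus the p-value of the overall F-test from a one-way ANOVA.
summarise_one <- function(nm, data = DataAnalysis) {
  d <- data[!is.na(data$Score), ]   # drop rows with a missing Score
  y <- d$Score
  g <- factor(d[[nm]])              # assumption: treat the predictor as categorical
  p <- anova(lm(y ~ g))[["Pr(>F)"]][1]
  data.frame(
    variable = nm,
    level    = levels(g),
    mean     = as.vector(tapply(y, g, mean)),
    sd       = as.vector(tapply(y, g, sd)),  # NA when a level has one observation
    p.value  = p                             # same value repeated for each level
  )
}

# One combined table, stacking the per-variable results.
result <- do.call(rbind, lapply(names(DataAnalysis)[-1], summarise_one))
print(result)
```

With only two levels (LOS, RESP) the ANOVA F-test is equivalent to an equal-variance two-sample t-test. If unequal variances are a concern, t.test or a non-parametric test could be swapped in at the line that computes p.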
