Groupwise summary statistics for all dependent variables in R using dplyr

I am trying to generate groupwise summary statistics (mean, sd, min, max, standard error, etc.) for each of 10 dependent variables. Hearing is my independent variable, so HL and NH are the two groups. I was able to do this for one variable (R_PTA) using these two approaches:

1.

library(dplyr)

# One row per hearing group (NHL) with summary statistics for R_PTA
RightPTA <- mydata %>%
  group_by(NHL) %>%
  summarise(
    n          = length(R_PTA),
    mean_R_PTA = mean(R_PTA),
    sd_R_PTA   = sd(R_PTA),
    se_R_PTA   = sd(R_PTA) / sqrt(length(R_PTA)),
    min_R_PTA  = min(R_PTA),
    max_R_PTA  = max(R_PTA)
  )

2.

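# Compute each statistic of R_PTA per hearing group (NHL).
# The results get their own names so that base functions such as mean() and sd() are not masked.
mean_pta   <- tapply(mydata$R_PTA, mydata$NHL, mean)
sd_pta     <- tapply(mydata$R_PTA, mydata$NHL, sd)
median_pta <- tapply(mydata$R_PTA, mydata$NHL, median)
max_pta    <- tapply(mydata$R_PTA, mydata$NHL, max)
min_pta    <- tapply(mydata$R_PTA, mydata$NHL, min)

# Bind the statistics into one table (groups as rows) and round to one decimal place
t1 <- round(cbind(mean = mean_pta, sd = sd_pta, median = median_pta,
                  max = max_pta, min = min_pta), digits = 1)
t1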

Here is the output:

RightearPTA
   mean  sd median  max min
HL 26.9 7.3   27.5 37.5 8.8
NH 11.6 4.1   12.5 16.2 2.5

I want exactly the same thing for the remaining 9 variables (L_PTA, B_PTA, etc.), but in one shot if possible. Is there a way to do this, or do I have to write the code separately for every dependent variable? I am sure a solution is out there, but I can't find it. Any help would be appreciated!

Consider a base R solution with by (the object-oriented wrapper to tapply, which subsets a data frame into factor groups) and a nested sapply (to build a matrix of stats). The example below demonstrates this with random, seeded data for 10 stats columns:

set.seed(88)

df <- data.frame(
  GROUP = sapply(seq(50), function(i) sample(c("NH", "HL"), 1, replace=TRUE)),
  STAT1 = rnorm(50)*100,
  STAT2 = rnorm(50),
  STAT3 = runif(50)*100,
  STAT4 = runif(50),
  STAT5 = rgamma(50, shape = 2)*100,
  STAT6 = rgamma(50, shape = 2),
  STAT7 = rpois(50, lambda = 100)*100,
  STAT8 = rpois(50, lambda = 100),
  STAT9 = rexp(50, rate = 1)*100,
  STAT10 = rexp(50, rate = 1)
)

dfList <- by(df, df$GROUP, FUN = function(d)
                sapply(d[2:ncol(d)], function(i) 
                  c(mean = mean(i, na.rm=TRUE),
                    sd = sd(i, na.rm=TRUE),
                    median = median(i, na.rm=TRUE),
                    min = min(i, na.rm=TRUE),
                    max = max(i, na.rm=TRUE)
                  )
                )
            )

Output

dfList$HL

#              STAT1       STAT2     STAT3      STAT4     STAT5     STAT6      STAT7     STAT8      STAT9      STAT10
# mean     -6.594221 -0.04059519 52.990723 0.58753311 157.55220 1.9196911 10103.4483 101.17241 113.089148 0.771495372
# sd      102.512709  0.99159105 31.055376 0.27339871 152.37034 1.4880694   709.3673  10.02165 121.360898 0.720117072
# median    8.034055  0.01163562 56.416484 0.56894472 136.58274 1.5150241 10200.0000 103.00000  77.302150 0.599291434
# min    -199.786535 -1.84703449  1.345751 0.00207128  22.56936 0.1553518  8400.0000  82.00000   2.396641 0.006532798
# max     251.976970  2.55701655 98.612123 0.99413520 806.38484 7.1030277 11900.0000 120.00000 487.719745 3.133768953

dfList$NH

#             STAT1       STAT2      STAT3      STAT4    STAT5    STAT6      STAT7     STAT8      STAT9    STAT10
# mean     26.51853 -0.13748799 44.1973692 0.46621478 155.7555 1.880407  9961.9048 104.38095 150.596480 1.1243476
# sd       90.57645  0.77843518 29.9227560 0.30340507 121.5361 1.105004   868.6059   8.44083 131.123059 1.1627959
# median   24.52202 -0.02949522 46.1950960 0.33646282 114.7845 1.736198  9700.0000 105.00000 122.841835 0.7819896
# min    -105.54741 -1.58980314  0.2636007 0.02044767  17.3282 0.291350  8900.0000  89.00000   7.799051 0.1108107
# max     194.78958  1.35889041 96.0175463 0.99160167 434.5724 4.368176 12000.0000 120.00000 554.307036 5.1537741
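
Since the question asks about dplyr specifically, the same idea can probably also be written with across(), which applies a named list of functions to every selected column within each group. The following is only a sketch under assumptions: it needs dplyr 1.0.0 or later, the small mydata frame is a made-up stand-in for the real data (only the names NHL, R_PTA, L_PTA and B_PTA come from the question), and std_err is a hypothetical helper defined here, not a base R or dplyr function.

```r
library(dplyr)

# Hypothetical stand-in for the asker's data: a grouping column (NHL)
# and a few numeric dependent variables. Replace with the real data frame.
set.seed(1)
mydata <- data.frame(
  NHL   = rep(c("HL", "NH"), each = 10),
  R_PTA = rnorm(20, mean = 20, sd = 5),
  L_PTA = rnorm(20, mean = 20, sd = 5),
  B_PTA = rnorm(20, mean = 20, sd = 5)
)

# Hypothetical helper: standard error of the mean, ignoring missing values
std_err <- function(x) sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))

summary_wide <- mydata %>%
  group_by(NHL) %>%
  summarise(
    # Apply every statistic to every numeric (non-grouping) column
    across(
      where(is.numeric),
      list(
        mean   = ~ mean(.x, na.rm = TRUE),
        sd     = ~ sd(.x, na.rm = TRUE),
        se     = ~ std_err(.x),
        median = ~ median(.x, na.rm = TRUE),
        min    = ~ min(.x, na.rm = TRUE),
        max    = ~ max(.x, na.rm = TRUE)
      ),
      .names = "{.col}_{.fn}"
    ),
    # Group size is computed last so that across() does not pick up
    # the new numeric column n and summarise it as well
    n = n()
  )

summary_wide
```

The result has one row per hearing group and one column per variable and statistic combination (R_PTA_mean, R_PTA_sd, and so on), plus the group size n. Unlike the by() approach, which returns a list with one matrix per group, this keeps everything in a single data frame.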
