How to calculate the standard error of all variables by group

Question

I have a data frame containing the following variables:

       Group  high  weigh  age  col5
row1   A      12    57     18   AA
row2   C      22    80     29   BB
row3   B      17    70     20   CC
row4   A      13    60     26   DD
row5   D      19    69     25   AA
row6   B      10    15     19   BB
row7   C      20    66     22   CC
row8   D      13    53     18   DD

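I want to use the function std.error from the plotrix package, or some other method (for example, computing sd/sqrt(length(data[,column])) directly), to calculate the standard error of every quantitative variable by group (the first column). The result I want looks like this: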

       Group  se_high  se_weigh  se_age
row1   A      0.223    0.023     0.1
row3   B      0.12     0.1       0.12
row7   C      0.1      0.04      0.09
row8   D      0.05     0.12      0.07

I tried using dplyr's group_by function to group by the first column and then apply std.error, but I don't know how to combine the two.

# dplyr code that calculates the mean by group
library(dplyr)
data %>%
  group_by(Group) %>%
  summarise_at(vars(high, weigh, age), mean)

I would also like to know how to calculate std.error by two grouping columns (for example, column 1 and the last column, column 5).

Thanks

Answer 1

You're close. summarise_at has been superseded in current dplyr (across() is the recommended replacement), so this is what I would do:

library(dplyr)
data %>%
  group_by(Group) %>%
  summarize(se_high=plotrix::std.error(high),
            se_weigh=plotrix::std.error(weigh),
            se_age=plotrix::std.error(age))

which returns

# A tibble: 4 x 4
  Group se_high se_weigh se_age
  <chr>   <dbl>    <dbl>  <dbl>
1 A         0.5      1.5    4  
2 B         3.5     27.5    0.5
3 C         1        7      3.5
4 D         3        8      3.5
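For readers who want to reproduce the table above without installing plotrix, here is a minimal sketch. It rebuilds the sample data from the question (row names omitted) and uses a small helper, se, which is a hypothetical name introduced here and not part of any package. It computes sd/sqrt(n), the formula mentioned in the question, and assumes there are no missing values.

```r
library(dplyr)

# Sample data from the question (row names omitted)
data <- data.frame(
  Group = c("A", "C", "B", "A", "D", "B", "C", "D"),
  high  = c(12, 22, 17, 13, 19, 10, 20, 13),
  weigh = c(57, 80, 70, 60, 69, 15, 66, 53),
  age   = c(18, 29, 20, 26, 25, 19, 22, 18),
  col5  = c("AA", "BB", "CC", "DD", "AA", "BB", "CC", "DD")
)

# Hypothetical helper: standard error of the mean, sd / sqrt(n).
# Assumes x contains no NA values.
se <- function(x) sd(x) / sqrt(length(x))

result <- data %>%
  group_by(Group) %>%
  summarize(se_high  = se(high),
            se_weigh = se(weigh),
            se_age   = se(age))

print(result)
```

Each group here has exactly two rows, so the standard error reduces to half the absolute difference between the two values. For group A, high is 12 and 13, which gives 0.5.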

Answer 2

Here is a solution that handles all numeric columns in one go:

library(dplyr)

df %>%
  group_by(Group) %>%
  summarise(across(where(is.numeric), ~ sd(.x)/ sqrt(length(.x)), .names = "std_{.col}"))

# A tibble: 4 x 4
  Group std_high std_weigh std_age
  <chr>    <dbl>     <dbl>   <dbl>
1 A          0.5       1.5     4  
2 B          3.5      27.5     0.5
3 C          1         7       3.5
4 D          3         8       3.5
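The question also asks about grouping by two columns (Group and col5). One way this might be done, as a sketch rather than a definitive answer, is to pass both columns to group_by and keep the same across() call. The data frame is rebuilt from the question so the snippet runs on its own; the name by_two and the se_ prefix are choices made here for illustration.

```r
library(dplyr)

# Sample data from the question (row names omitted)
df <- data.frame(
  Group = c("A", "C", "B", "A", "D", "B", "C", "D"),
  high  = c(12, 22, 17, 13, 19, 10, 20, 13),
  weigh = c(57, 80, 70, 60, 69, 15, 66, 53),
  age   = c(18, 29, 20, 26, 25, 19, 22, 18),
  col5  = c("AA", "BB", "CC", "DD", "AA", "BB", "CC", "DD")
)

# Group by both columns; grouping columns are excluded from across() automatically
by_two <- df %>%
  group_by(Group, col5) %>%
  summarise(across(where(is.numeric),
                   ~ sd(.x) / sqrt(length(.x)),
                   .names = "se_{.col}"),
            .groups = "drop")

print(by_two)
```

Note that in this particular sample every Group/col5 combination occurs only once, and sd() of a single value is NA, so all standard errors come out as NA. Meaningful values need at least two rows per combination.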
