How to calculate row-wise means for multiple groups using dplyr in R?

Question

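I have a dataframe with 4 columns: Age, Location, Distance, and Value. Age and Location each have two possible values, while Distance can have three. Value is the observed continuous variable, measured 3 times for each Distance.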

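Accounting for Age and Location, I want to calculate the mean of Value for one of the Distance values, and then a second mean over the other two Distance values combined. The question I am trying to answer is: for each Age and Location, what is the Value at Distance 0.5 compared with Distance 1.5 and 2.5?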

How can I do this with dplyr?
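To make "combined" concrete: the second mean is read here as the pooled mean over every observation at Distance 1.5 and 2.5, not the average of the two per-distance means. The two agree only when both distances have the same number of observations, as in this design. A minimal sketch with hypothetical numbers (not taken from the example data):

```r
# Hypothetical measurements, for illustration only
d15 <- c(200, 210, 190)   # three values at Distance 1.5
d25 <- c(220, 230, 240)   # three values at Distance 2.5

# Equal group sizes: pooled mean and mean of means coincide
pooled        <- mean(c(d15, d25))
mean_of_means <- mean(c(mean(d15), mean(d25)))

# Unequal group sizes: the two quantities diverge
d25_short             <- c(220, 230)
pooled_unequal        <- mean(c(d15, d25_short))
mean_of_means_unequal <- mean(c(mean(d15), mean(d25_short)))

print(c(pooled = pooled, mean_of_means = mean_of_means))
print(c(pooled = pooled_unequal, mean_of_means = mean_of_means_unequal))
```

If some measurements were missing, the pooled version would weight each observation equally, which is usually what is wanted for this kind of comparison.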

Example data

library(dplyr)

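set.seed(123)  # make the random Value column reproducible
# Start from an empty 36 x 4 data frame and name its columns
df1 <- data.frame(matrix(ncol = 4, nrow = 36))
x <- c("Age","Location","Distance","Value")
colnames(df1) <- x
# Vectors shorter than 36 are recycled to fill every row
df1$Age <- rep(c(1,2), each = 18)
df1$Location <- as.character(rep(c("Central","North"), each = 9))
df1$Distance <- rep(c(0.5,1.5,2.5), each = 3)
# Simulated measurements: normal with mean 200, sd 25, rounded to integers
df1$Value <- round(rnorm(36,200,25),0)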
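Because the shorter vectors are recycled, it can help to confirm that the data really has 3 measurements in every Age/Location/Distance cell. The sketch below rebuilds the same layout in a single `data.frame()` call (a more compact construction, not the one used above) and tabulates the cells with `dplyr::count()`; the name `cell_counts` is only illustrative.

```r
library(dplyr)

set.seed(123)
# Same layout as the example data, with the recycling written out explicitly
df1 <- data.frame(
  Age      = rep(c(1, 2), each = 18),
  Location = rep(rep(c("Central", "North"), each = 9), times = 2),
  Distance = rep(rep(c(0.5, 1.5, 2.5), each = 3), times = 4),
  Value    = round(rnorm(36, 200, 25), 0)
)

# One row per Age/Location/Distance combination, with n = number of observations
cell_counts <- df1 %>% count(Age, Location, Distance)
print(cell_counts)
```

With this design there should be 12 cells, each holding 3 observations.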

The output should look like this

  Age Location Mean_0.5 Mean_1.5_and_2.5
1   1  Central      206              202
2   1    North      210              201
3   2  Central      193              186
4   2    North      202              214

Answer 1

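After grouping by 'Age' and 'Location', we can subset 'Value' by the 'Distance' values with %in% or == (assuming the floating-point precision of 'Distance' is exact).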

library(dplyr)
df1 %>%
     group_by(Age, Location) %>% 
     summarise(Mean_0.5 = mean(Value[Distance == 0.5]), 
        Mean_1.5_and_2.5 = mean(Value[Distance %in% c(1.5, 2.5)]),
        .groups = 'drop')

-output

# A tibble: 4 × 4
    Age Location Mean_0.5 Mean_1.5_and_2.5
  <dbl> <chr>       <dbl>            <dbl>
1     1 Central      206.             202.
2     1 North        210.             201.
3     2 Central      193              186.
4     2 North        202.             214.
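The precision caveat matters when 'Distance' comes from arithmetic or a file import rather than from literals, because `==` and `%in%` compare doubles exactly. A possible variant, sketched here under the assumption that a small tolerance is acceptable, swaps the exact comparisons for `dplyr::near()`. The data is rebuilt in one call so the block runs on its own, and `res_near` is just an illustrative name.

```r
library(dplyr)

# Exact comparison of doubles can fail where a tolerance-based one succeeds
0.1 + 0.2 == 0.3        # FALSE
near(0.1 + 0.2, 0.3)    # TRUE

set.seed(123)
# Same layout as the example data
df1 <- data.frame(
  Age      = rep(c(1, 2), each = 18),
  Location = rep(rep(c("Central", "North"), each = 9), times = 2),
  Distance = rep(rep(c(0.5, 1.5, 2.5), each = 3), times = 4),
  Value    = round(rnorm(36, 200, 25), 0)
)

res_near <- df1 %>%
  group_by(Age, Location) %>%
  summarise(
    # near() treats values within a small tolerance as equal
    Mean_0.5         = mean(Value[near(Distance, 0.5)]),
    Mean_1.5_and_2.5 = mean(Value[near(Distance, 1.5) | near(Distance, 2.5)]),
    .groups = "drop"
  )
print(res_near)
```

On this example data the distances are exact literals, so the result should match the `==` / `%in%` version; the difference only shows up when the stored values are slightly off.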

Solution 1
1 accepted 2021-09-27 20:32:04
