
Calculate the mean of one column for each of the other columns

Suppose that I have the following df:

df <- structure(list(var1 = c(1, 0, 1, 0, 0, 1), var2 = c(0, 0, 0, 1, 1, 0),
    var99 = c(0, 1, 1, 1, 1, 0), value = c(154, 120, 100, 180, 200, 460)),
    .Names = c("var1", "var2", "var99", "value"), row.names = c(NA, -6L),
    class = "data.frame")

And I want to obtain this output:

structure(list(var = c("var1", "var2", "var99"), mean = c(238, 
190, 150)), .Names = c("var", "mean"), row.names = c(NA, -3L), class = 
"data.frame")

That is, I want the mean of column 'value' for each of the other columns (var1, var2, ..., var99). For each column, only the rows where that column equals 1 are taken into account when computing the mean.
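As a small worked check of the rule above, here is the calculation for var1 alone, using the values from df. The names `picked` and `mean_var1` are only illustrative and do not appear in the original post.

```r
# Values copied from the example df above.
value <- c(154, 120, 100, 180, 200, 460)
var1  <- c(1, 0, 1, 0, 0, 1)

# Keep only the entries of value where var1 equals 1 (rows 1, 3 and 6).
picked <- value[var1 == 1]

# Average of 154, 100 and 460.
mean_var1 <- mean(picked)

print(picked)
print(mean_var1)
```

This matches the 238 shown for var1 in the desired output.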

I have done it with a for loop:

l <- vector("list", 3)
for (i in 1:3) {
  # mean of 'value' over the rows where column i equals 1
  l[[i]] <- mean(df$value[df[, i] == 1], na.rm = TRUE)
}

Can anyone suggest another approach that avoids the loop, using base R where possible?

sapply(df[, -4], weighted.mean, x=df[, 4])

Or:

colSums(sweep(df[, -4], 1, df[, 4], `*`)) / colSums(df[, -4])
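Both expressions above appear to rely on the indicator columns containing only 0 and 1: with such weights, sum(value * indicator) / sum(indicator) is the same as the mean of value over the rows where the indicator is 1. A minimal sketch for var99 alone, assuming the data from the question (the names `by_weight`, `by_sums` and `by_subset` are illustrative):

```r
# Values copied from the example df in the question.
value <- c(154, 120, 100, 180, 200, 460)
var99 <- c(0, 1, 1, 1, 1, 0)

# 0/1 indicator used as weights.
by_weight <- weighted.mean(value, w = var99)

# The same ratio written out by hand.
by_sums <- sum(value * var99) / sum(var99)

# Plain mean over the selected rows, for comparison.
by_subset <- mean(value[var99 == 1])

print(c(by_weight, by_sums, by_subset))
```

If a column held values other than 0 and 1, the first two would become genuinely weighted means and would no longer agree with the subset mean.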

Or:

sapply(subset(df, select = -value), function(x) mean(df$value[x == 1]))
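Each of these returns a named numeric vector rather than the two-column data frame shown in the question. One possible way to reshape it, as a sketch (the names `means` and `result` are made up for illustration, and df is rebuilt here so the snippet runs on its own):

```r
# Rebuild the example data from the question.
df <- data.frame(
  var1  = c(1, 0, 1, 0, 0, 1),
  var2  = c(0, 0, 0, 1, 1, 0),
  var99 = c(0, 1, 1, 1, 1, 0),
  value = c(154, 120, 100, 180, 200, 460)
)

# Per-column mean of 'value' over the rows where the column equals 1.
indicator_cols <- setdiff(names(df), "value")
means <- sapply(df[indicator_cols], function(x) mean(df$value[x == 1]))

# Turn the named vector into a data frame with columns 'var' and 'mean'.
result <- data.frame(var = names(means), mean = unname(means),
                     stringsAsFactors = FALSE)

print(result)
```

Selecting the indicator columns by name instead of by position (-4) keeps the code working if the position of value changes.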
