Subset columns with common values in long data frame
I have the following data frame:
Group 1 ID A Value
Group 1 ID B Value
Group 1 ID C Value
Group 2 ID B Value
Group 2 ID C Value
Group 3 ID B Value
… … …
I am trying to use dplyr to get the mean value for each ID across groups (e.g. the mean of the value of ID B across group 1, group 2, and group 3). However, not every group has all of the IDs, so I want to subset the data so that means are computed only for IDs that appear in every group. I know that I can
group_by(dataFrame, group) %>% filter subset %>% group_by(id) %>% mutate(mean)
but I don't know what code to place in the filter subset.
How about
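ngroups <- n_distinct(df$group)  # total number of groups (assumes a `group` column)
df %>%
  group_by(id) %>%
  mutate(count = n_distinct(group)) %>%  # number of groups each ID appears in
  filter(count == ngroups) %>% #...        keep only IDs present in every group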
So basically, remove all the rows in the data frame that correspond to an ID that doesn't appear in all groups, then perform the computation.
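As a rough end-to-end sketch of that idea: the toy data below is hypothetical, and the column names `group`, `id`, and `value` are assumptions taken from the pseudocode in the question rather than from the real data frame. It uses `summarise()` to return one mean per ID; swap in `mutate()` if the mean should be attached to every row instead.

```r
library(dplyr)

# Hypothetical toy data; column names `group`, `id`, `value` are assumed
df <- data.frame(
  group = c(1, 1, 1, 2, 2, 3),
  id    = c("A", "B", "C", "B", "C", "B"),
  value = c(10, 20, 30, 40, 50, 60)
)

# Total number of distinct groups in the data
ngroups <- n_distinct(df$group)

result <- df %>%
  group_by(id) %>%
  # Keep only IDs that occur in every group
  filter(n_distinct(group) == ngroups) %>%
  # One mean per remaining ID
  summarise(mean_value = mean(value))

print(result)
```

Counting with `n_distinct(group)` rather than `n()` should keep the filter correct even if an ID appears more than once within the same group.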