Subset dataframe by removing duplicates for each level of a factor
I have a data frame:
df <- data.frame(region = c("1", "1", "1", "1", "1", "1", "1", "1", "2", "2"),
                 plot = c("1", "1", "1", "2", "2", "2", "3", "3", "3", "3"),
                 interact = c("A_B", "C_D", "C_D", "E_F", "C_D", "C_D", "D_E",
                              "D_E", "C_B", "A_B"))
I would like to get the count of unique levels of interact within each plot subset. The final data frame would look like:
result<-
Plot freq
1 2
2 2
3 3
I would like to use dplyr and have gotten this far:
df2 <- df %>% group_by(plot) %>% mutate(freq = length(unique(interact)))
But with the code above, I have not found a way to keep only one value per plot (i.e., to remove the duplicated freq values within each unique plot).
Try this:
library(dplyr)
df %>% group_by(plot) %>% summarise(n = length(unique(interact)))
plot n
1 1 2
2 2 2
3 3 3
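The same summary can also be written with dplyr's n_distinct(), which counts unique values directly. This is a minimal sketch, assuming dplyr is installed; the column name freq is chosen here only to match the desired output in the question.

```r
library(dplyr)

# Data frame from the question
df <- data.frame(
  region   = c("1", "1", "1", "1", "1", "1", "1", "1", "2", "2"),
  plot     = c("1", "1", "1", "2", "2", "2", "3", "3", "3", "3"),
  interact = c("A_B", "C_D", "C_D", "E_F", "C_D", "C_D",
               "D_E", "D_E", "C_B", "A_B")
)

# summarise() collapses each group to a single row;
# n_distinct(x) gives the same count as length(unique(x))
result <- df %>%
  group_by(plot) %>%
  summarise(freq = n_distinct(interact))

print(result)
```

Because summarise() returns one row per group, no separate de-duplication step is needed.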
Or, building on your own approach:
df2 <- df %>% group_by(plot) %>% mutate(freq = length(unique(interact)))
df2 <- df2[!duplicated(df2$plot), ]
region plot interact freq
<fctr> <fctr> <fctr> <int>
1 1 1 A_B 2
2 1 2 E_F 2
3 1 3 D_E 3
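Note that the duplicated() approach keeps the region and interact values of whichever row comes first in each plot, which may not be meaningful. If only plot and freq are wanted, one option is dplyr's distinct() on those two columns. This is a sketch under that assumption, not necessarily what the answer's author intended.

```r
library(dplyr)

# Data frame from the question
df <- data.frame(
  region   = c("1", "1", "1", "1", "1", "1", "1", "1", "2", "2"),
  plot     = c("1", "1", "1", "2", "2", "2", "3", "3", "3", "3"),
  interact = c("A_B", "C_D", "C_D", "E_F", "C_D", "C_D",
               "D_E", "D_E", "C_B", "A_B")
)

df2 <- df %>%
  group_by(plot) %>%
  mutate(freq = length(unique(interact))) %>%  # per-plot count, repeated on every row
  ungroup() %>%
  distinct(plot, freq)                         # keep one row per plot/freq pair, drop other columns

print(df2)
```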