Split data frame conditional on factor level summarise based on duplicated values using dplyr
I have a data frame like this:
df <- data.frame(
  region = c("1","1","1","1","1","1","1","1","2","2"),
  loc    = c("A","A","A","B","B","B","C","D","E","F"),
  sp1    = c("a","a","b","a","e","e","e","e","a","a"),
  sp2    = c("b","b","c","b","f","f","f","f","b","b"),
  inter  = c("a_b","a_b","b_c","a_b","e_f","e_f","e_f","e_f","a_b","a_b")
)
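Within each region, I would like to find every level of inter that is duplicated across loc, and then count how many locations it occurs in. The output data frame should look like this: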
df <- data.frame(
  region = c("1","1","2"),
  sp1    = c("a","e","a"),
  sp2    = c("b","f","b"),
  inter  = c("a_b","e_f","a_b"),
  freq   = c("2","3","2")
)
I tried the following:
df %>%
  group_by(region, inter) %>%
  filter(duplicated(inter))
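You can filter to the groups that have more than one row for each combination of region and inter, then use n_distinct to count the number of unique locations. I included the species variables in the grouping so that they are kept in the result.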
library(dplyr)

df %>%
  group_by(region, sp1, sp2, inter) %>%
  filter(n() > 1) %>%                 # keep groups with more than one row
  summarise(n = n_distinct(loc))      # count unique locations per group
# A tibble: 3 x 5
# Groups: region, sp1, sp2 [?]
region sp1 sp2 inter n
<fctr> <fctr> <fctr> <fctr> <int>
1 1 a b a_b 2
2 1 e f e_f 3
3 2 a b a_b 2
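One edge case worth noting: filter(n() > 1) counts rows, so an inter that repeats only inside a single loc would still be kept, with a count of 1. If the intent is strictly "occurs in more than one location", a possible variant is to count the distinct locations first and filter on that count. This is a minimal sketch, not part of the original answer. The column name freq is taken from the desired output in the question, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  region = c("1","1","1","1","1","1","1","1","2","2"),
  loc    = c("A","A","A","B","B","B","C","D","E","F"),
  sp1    = c("a","a","b","a","e","e","e","e","a","a"),
  sp2    = c("b","b","c","b","f","f","f","f","b","b"),
  inter  = c("a_b","a_b","b_c","a_b","e_f","e_f","e_f","e_f","a_b","a_b")
)

res <- df %>%
  group_by(region, sp1, sp2, inter) %>%
  # number of distinct locations per region/interaction
  summarise(freq = n_distinct(loc), .groups = "drop") %>%
  # keep only interactions seen in more than one location
  filter(freq > 1)

print(res)
```

On the example data this keeps the same three region and inter combinations as the answer above. The two approaches only differ when an interaction is repeated within one location and appears nowhere else in the region.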