Conditional sum across multiple columns using dplyr?
I have a dataset like the one shown below, but with about 100 columns for different animals:
location <- c("A","A","A","A","B","B","C","C","D", "D","D")
season <- c("2", "2", "3", "4","2","3","1","2","2","4","4")
cat <- c(1,1,1,1,0,1,1,1,0,1,0)
dog <- c(0,0,1,1,1,1,0,1,0,1,1)
df <- data.frame(location, season,cat, dog)
location season cat dog
1 A 2 1 0
2 A 2 1 0
3 A 3 1 1
4 A 4 1 1
5 B 2 0 1
6 B 3 1 1
7 C 1 1 0
8 C 2 1 1
9 D 2 0 0
10 D 4 1 1
11 D 4 0 1
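I am trying to sum all of the animal columns by location and season, but I want a species column and a corresponding total column for each unique combination of location and season. Not every animal column has a value of 1 for every combination of location and season, and the columns all have different names (i.e. different animals). I would also like to drop any location and season row where the species total = 0.
Something like this: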
location season species n
1 A 2 cat 2
2 A 3 cat 1
3 A 4 cat 1
4 B 3 cat 1
5 C 1 cat 1
6 C 2 cat 1
7 D 4 cat 1
8 A 3 dog 1
9 A 4 dog 1
10 B 2 dog 1
11 B 3 dog 1
12 C 2 dog 1
13 D 4 dog 2
I think dplyr is the way to go, but I can't seem to get it right. Thanks!
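library(dplyr)
library(tidyr)
df %>% group_by(location, season) %>%
  # sum each animal column within every location/season combination
  summarise(across(c(cat, dog), ~sum(.))) %>%
  # reshape to one row per location/season/species
  pivot_longer(cols = c(cat, dog), names_to = "species", values_to = "n") %>%
  arrange(species, location, season) %>%
  filter(n != 0)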
# A tibble: 13 x 4
# Groups: location [4]
location season species n
<chr> <chr> <chr> <dbl>
1 A 2 cat 2
2 A 3 cat 1
3 A 4 cat 1
4 B 3 cat 1
5 C 1 cat 1
6 C 2 cat 1
7 D 4 cat 1
8 A 3 dog 1
9 A 4 dog 1
10 B 2 dog 1
11 B 3 dog 1
12 C 2 dog 1
13 D 4 dog 2
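Since the real data has about 100 animal columns, naming each one inside c(cat, dog) will not scale. Below is a minimal sketch of the same approach that selects the animal columns implicitly. It assumes that every column other than location and season is an animal column, and that dplyr is at least version 1.0.0 (for across() and the .groups argument). The name result is just for illustration.

```r
library(dplyr)
library(tidyr)

# Example data from the question
location <- c("A","A","A","A","B","B","C","C","D","D","D")
season <- c("2","2","3","4","2","3","1","2","2","4","4")
cat <- c(1,1,1,1,0,1,1,1,0,1,0)
dog <- c(0,0,1,1,1,1,0,1,0,1,1)
df <- data.frame(location, season, cat, dog)

result <- df %>%
  group_by(location, season) %>%
  # across(everything()) skips the grouping columns, so every remaining
  # column is summed (assumption: all of them are animal columns)
  summarise(across(everything(), sum), .groups = "drop") %>%
  # reshape every non-key column into species/n pairs
  pivot_longer(cols = !c(location, season), names_to = "species", values_to = "n") %>%
  filter(n != 0) %>%
  arrange(species, location, season)

print(result)
```

On the example data this should give the same 13 rows as above, with the difference that .groups = "drop" returns an ungrouped tibble.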
Get the data in long format, sum the value for each location, season and Species, and remove the rows with a value of 0.
library(dplyr)
df %>%
tidyr::pivot_longer(cols = cat:dog, names_to = 'Species') %>%
group_by(location, season, Species) %>%
summarise(value = sum(value)) %>%
ungroup() %>%
filter(value > 0)
# location season Species value
# <chr> <chr> <chr> <dbl>
# 1 A 2 cat 2
# 2 A 3 cat 1
# 3 A 3 dog 1
# 4 A 4 cat 1
# 5 A 4 dog 1
# 6 B 2 dog 1
# 7 B 3 cat 1
# 8 B 3 dog 1
# 9 C 1 cat 1
#10 C 2 cat 1
#11 C 2 dog 1
#12 D 4 cat 1
#13 D 4 dog 2
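The group_by / summarise / ungroup steps can probably be collapsed with a weighted count(). This is only a sketch of an alternative, not part of the answer above. It assumes all columns except location and season are animal columns, and the name result_count is just for illustration.

```r
library(dplyr)
library(tidyr)

# Example data from the question
location <- c("A","A","A","A","B","B","C","C","D","D","D")
season <- c("2","2","3","4","2","3","1","2","2","4","4")
cat <- c(1,1,1,1,0,1,1,1,0,1,0)
dog <- c(0,0,1,1,1,1,0,1,0,1,1)
df <- data.frame(location, season, cat, dog)

result_count <- df %>%
  # long format: one row per original row and animal column
  pivot_longer(cols = !c(location, season), names_to = "Species") %>%
  # count() with wt sums `value` per group and returns an ungrouped result
  count(location, season, Species, wt = value) %>%
  filter(n > 0)

print(result_count)
```

Here the total ends up in a column called n rather than value.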