
Conditional sum across multiple columns using dplyr?

I have a dataset that looks like the one below, but with about 100 columns for different animals.

location <- c("A","A","A","A","B","B","C","C","D", "D","D")
season <- c("2", "2", "3", "4","2","3","1","2","2","4","4")
cat <- c(1,1,1,1,0,1,1,1,0,1,0)
dog <- c(0,0,1,1,1,1,0,1,0,1,1)

df <- data.frame(location, season,cat, dog)

     location season  cat  dog
1          A       2    1    0
2          A       2    1    0
3          A       3    1    1
4          A       4    1    1
5          B       2    0    1
6          B       3    1    1
7          C       1    1    0
8          C       2    1    1
9          D       2    0    0
10         D       4    1    1
11         D       4    0    1

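I'm trying to sum all the animal columns by location and season, but I want a species column and a corresponding total column for each unique combination of location and season. Not every animal column has a 1 for every combination of location and season, and the columns all have different names (i.e. different animals). I'd like to drop any location/season row where the species total is 0.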

Something like this:

    location season species n
1         A       2    cat  2  
2         A       3    cat  1
3         A       4    cat  1
4         B       3    cat  1
5         C       1    cat  1
6         C       2    cat  1
7         D       4    cat  1
8         A       3    dog  1
9         A       4    dog  1
10        B       2    dog  1
11        B       3    dog  1
12        C       2    dog  1
13        D       4    dog  2

I think dplyr is the way to go, but I can't seem to get it right. Thanks!

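library(dplyr)
library(tidyr)  # provides pivot_longer()

df %>% group_by(location, season) %>%
  summarise(across(c(cat, dog), ~sum(.))) %>%  # one total per animal and group
  pivot_longer(cols = c(cat, dog), names_to = "species", values_to = "n") %>%
  arrange(species, location, season) %>%
  filter(n != 0)  # drop location/season/species rows with no sightings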

# A tibble: 13 x 4
# Groups:   location [4]
   location season species     n
   <chr>    <chr>  <chr>   <dbl>
 1 A        2      cat         2
 2 A        3      cat         1
 3 A        4      cat         1
 4 B        3      cat         1
 5 C        1      cat         1
 6 C        2      cat         1
 7 D        4      cat         1
 8 A        3      dog         1
 9 A        4      dog         1
10 B        2      dog         1
11 B        3      dog         1
12 C        2      dog         1
13 D        4      dog         2
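
With around 100 animal columns, listing each one inside `c(cat, dog)` gets tedious. A minimal sketch of the same approach that selects the columns by type instead, assuming every numeric column is an animal and that `location` and `season` are the only key columns (both are assumptions about the real data, and the name `result` is only illustrative):

```r
library(dplyr)
library(tidyr)

# Same toy data as in the question
df <- data.frame(
  location = c("A","A","A","A","B","B","C","C","D","D","D"),
  season   = c("2","2","3","4","2","3","1","2","2","4","4"),
  cat      = c(1,1,1,1,0,1,1,1,0,1,0),
  dog      = c(0,0,1,1,1,1,0,1,0,1,1)
)

result <- df %>%
  group_by(location, season) %>%
  # sum every numeric column, so the animals need not be named one by one
  summarise(across(where(is.numeric), sum), .groups = "drop") %>%
  # treat everything except the two key columns as a species column
  pivot_longer(cols = -c(location, season),
               names_to = "species", values_to = "n") %>%
  filter(n != 0) %>%
  arrange(species, location, season)
```

Here `.groups = "drop"` returns an ungrouped tibble, so the `# Groups:` line in the printed output above would not appear. If the real data has other numeric columns that are not animals, they would need to be excluded from the selection.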

Get the data in long format, sum the values for each location, season and Species, and drop the rows with a 0 value.

library(dplyr)

df %>%
  tidyr::pivot_longer(cols = cat:dog, names_to = 'Species') %>%
  group_by(location, season, Species) %>%
  summarise(value = sum(value)) %>%
  ungroup %>%
  filter(value > 0)

#  location season Species value
#   <chr>    <chr>  <chr>   <dbl>
# 1 A        2      cat         2
# 2 A        3      cat         1
# 3 A        3      dog         1
# 4 A        4      cat         1
# 5 A        4      dog         1
# 6 B        2      dog         1
# 7 B        3      cat         1
# 8 B        3      dog         1
# 9 C        1      cat         1
#10 C        2      cat         1
#11 C        2      dog         1
#12 D        4      cat         1
#13 D        4      dog         2
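
As a possible shortcut, the `group_by()` / `summarise()` / `ungroup()` steps can be collapsed into a weighted `count()`. This is only a sketch of a variation, not part of the original answer; it assumes `location` and `season` are the only non-animal columns, and the names `counts` and `n` are illustrative:

```r
library(dplyr)
library(tidyr)

# Same toy data as in the question
df <- data.frame(
  location = c("A","A","A","A","B","B","C","C","D","D","D"),
  season   = c("2","2","3","4","2","3","1","2","2","4","4"),
  cat      = c(1,1,1,1,0,1,1,1,0,1,0),
  dog      = c(0,0,1,1,1,1,0,1,0,1,1)
)

counts <- df %>%
  # long format: one row per original row and animal column
  pivot_longer(cols = -c(location, season), names_to = "Species") %>%
  # zero rows cannot contribute to a total, so drop them before counting
  filter(value > 0) %>%
  # wt = value sums the values instead of counting rows
  count(location, season, Species, wt = value, name = "n")
```

Filtering before counting means combinations whose total would be 0 never appear, so no second filter is needed afterwards.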

