Conditional sum across multiple columns using dplyr?

I have a dataset like the one shown below, but with about 100 columns for different animals.

location <- c("A","A","A","A","B","B","C","C","D", "D","D")
season <- c("2", "2", "3", "4","2","3","1","2","2","4","4")
cat <- c(1,1,1,1,0,1,1,1,0,1,0)
dog <- c(0,0,1,1,1,1,0,1,0,1,1)

df <- data.frame(location, season,cat, dog)

     location season  cat  dog
1          A       2    1    0
2          A       2    1    0
3          A       3    1    1
4          A       4    1    1
5          B       2    0    1
6          B       3    1    1
7          C       1    1    0
8          C       2    1    1
9          D       2    0    0
10         D       4    1    1
11         D       4    0    1

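I am trying to sum all the animal columns by location and season, but I want a species column and a corresponding total column for each unique combination of location and season. Not every animal column has a value of 1 for every combination of location and season, and the columns all have different names (one per animal). I want to drop any location and season row where the species total is 0.

Something like this: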

    location season species n
1         A       2    cat  2  
2         A       3    cat  1
3         A       4    cat  1
4         B       3    cat  1
5         C       1    cat  1
6         C       2    cat  1
7         D       4    cat  1
8         A       3    dog  1
9         A       4    dog  1
10        B       2    dog  1
11        B       3    dog  1
12        C       2    dog  1
13        D       4    dog  2

I think dplyr is the way to go, but I can't seem to get it right. Thanks!

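library(dplyr)
library(tidyr)

df %>% group_by(location, season) %>%
  # Sum each animal column within each location/season group
  summarise(across(c(cat, dog), ~sum(.))) %>%
  pivot_longer(cols = c(cat, dog), names_to = "species", values_to = "n") %>%
  arrange(species, location, season) %>%
  # Drop combinations with a zero total
  filter(n != 0)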

# A tibble: 13 x 4
# Groups:   location [4]
   location season species     n
   <chr>    <chr>  <chr>   <dbl>
 1 A        2      cat         2
 2 A        3      cat         1
 3 A        4      cat         1
 4 B        3      cat         1
 5 C        1      cat         1
 6 C        2      cat         1
 7 D        4      cat         1
 8 A        3      dog         1
 9 A        4      dog         1
10 B        2      dog         1
11 B        3      dog         1
12 C        2      dog         1
13 D        4      dog         2
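
With around 100 animal columns, as the question describes, listing each one inside `c(cat, dog)` does not scale. The following is a minimal sketch of the same approach that avoids naming the animals, assuming every column other than `location` and `season` is a numeric animal column (that assumption is mine, not stated in the question). Within a grouped `summarise()`, `across(everything(), ...)` skips the grouping columns, and `.groups = "drop"` returns an ungrouped result.

```r
library(dplyr)
library(tidyr)

# Example data from the question
df <- data.frame(
  location = c("A", "A", "A", "A", "B", "B", "C", "C", "D", "D", "D"),
  season   = c("2", "2", "3", "4", "2", "3", "1", "2", "2", "4", "4"),
  cat      = c(1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0),
  dog      = c(0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1)
)

result <- df %>%
  group_by(location, season) %>%
  # Assumption: all non-grouping columns are numeric animal columns
  summarise(across(everything(), sum), .groups = "drop") %>%
  # Reshape every column except the two keys into species/n pairs
  pivot_longer(cols = !c(location, season),
               names_to = "species", values_to = "n") %>%
  filter(n != 0) %>%
  arrange(species, location, season)
```

If the real data contains other non-animal columns, they would need to be excluded in both the `across()` and `pivot_longer()` selections.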

Get the data in long format, sum the value for each location, season, and Species, and drop the rows with a 0 value.

library(dplyr)

df %>%
  # Long format: one row per original row and species
  tidyr::pivot_longer(cols = cat:dog, names_to = 'Species') %>%
  group_by(location, season, Species) %>%
  summarise(value = sum(value)) %>%
  ungroup() %>%
  filter(value > 0)

#  location season Species value
#   <chr>    <chr>  <chr>   <dbl>
# 1 A        2      cat         2
# 2 A        3      cat         1
# 3 A        3      dog         1
# 4 A        4      cat         1
# 5 A        4      dog         1
# 6 B        2      dog         1
# 7 B        3      cat         1
# 8 B        3      dog         1
# 9 C        1      cat         1
#10 C        2      cat         1
#11 C        2      dog         1
#12 D        4      cat         1
#13 D        4      dog         2
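
As a possible shorthand for the group, sum, and ungroup steps, `dplyr::count()` with a `wt` argument sums the weights per group and returns an ungrouped result. This is a sketch rather than part of the original answer; it assumes that every column other than `location` and `season` is a numeric animal column, and the output column name `n` is chosen to match the layout requested in the question.

```r
library(dplyr)
library(tidyr)

# Example data from the question
df <- data.frame(
  location = c("A", "A", "A", "A", "B", "B", "C", "C", "D", "D", "D"),
  season   = c("2", "2", "3", "4", "2", "3", "1", "2", "2", "4", "4"),
  cat      = c(1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0),
  dog      = c(0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1)
)

result2 <- df %>%
  # Assumption: everything except the two key columns is an animal column
  pivot_longer(cols = !c(location, season),
               names_to = "species", values_to = "value") %>%
  # count() with wt sums `value` within each location/season/species group
  count(location, season, species, wt = value, name = "n") %>%
  filter(n > 0)
```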

