dplyr: workflow to subset, summarize, and mutate new function
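I'm trying to find the most efficient way to accomplish a series of goals: group my data, summarize a column, and mutate a new column based on that summary.
I want to do this using the example data below.
This post almost gets me there, but I don't want to summarize across (across()) multiple columns: dplyr: group_by, sum various columns, and apply a function based on grouped row sums?
How would you go from "df_have" to "df_want" using pipes in dplyr?
Thanks!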
site <- c("X", "Y", "Y", "X", "X", "X", "Y", "X", "Y", "X", "Y", "Y", "X", "X", "X", "Y", "X", "Y")
trmt <- c("yes", "yes", "no", "no", "yes", "no", "no", "yes", "yes", "yes", "yes", "no", "no", "yes", "no", "no", "yes", "yes")
id <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9)
species <- c("a", "b", "a", "c", "d", "a", "e", "b", "d", "a", "b", "m", "c", "p", "a", "q", "r", "d")
count <- c(28, 17, 7, 8, 2, 9, 1, 5, 3, 12, 4, 18, 3, 30, 12, 21, 18, 6)
extra <- c("A", "A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B")
df_have <- cbind(site, trmt, id, species, count, extra)
df_have <- as.data.frame(df_have)
df_have
site1 <- c("X", "Y", "Y", "Y", "X", "X", "X", "X", "Y", "Y", "X", "X", "Y")
trmt1 <- c("yes", "yes", "no", "no", "no", "yes", "yes", "no", "no", "no", "yes", "yes", "yes")
id1 <- c(1, 2, 3, 3, 4, 5, 5, 6, 7, 7, 8, 8, 9)
species1 <- c("a", "b", "a", "m", "c", "d", "p", "a", "e", "q", "b", "r", "d" )
sum <- c(40, 21, 7, 18, 11, 2, 30, 21, 1, 21, 5, 18, 9)
relabund <- c(100, 100, 28, 72, 100, 6.25, 93.75, 100, 4.55, 95.45, 21.74, 78.26, 100)
df_want <- cbind(site1, trmt1, id1, species1, sum, relabund)
df_want <- as.data.frame(df_want)
df_want
Here is a dplyr option:
library(dplyr)
df_have %>%
  group_by(site, trmt, id, species) %>%
  summarise(sum = sum(as.integer(count)), .groups = "drop") %>%
  group_by(id) %>%
  mutate(relabund = sum / sum(sum) * 100) %>%
  ungroup() %>%
  arrange(id, species)
# A tibble: 13 x 6
#    site  trmt  id    species   sum relabund
#    <chr> <chr> <chr> <chr>   <int>    <dbl>
#  1 X     yes   1     a          40   100
#  2 Y     yes   2     b          21   100
#  3 Y     no    3     a           7    28
#  4 Y     no    3     m          18    72
#  5 X     no    4     c          11   100
#  6 X     yes   5     d           2     6.25
#  7 X     yes   5     p          30    93.8
#  8 X     no    6     a          21   100
#  9 Y     no    7     e           1     4.55
# 10 Y     no    7     q          21    95.5
# 11 X     yes   8     b           5    21.7
# 12 X     yes   8     r          18    78.3
# 13 Y     yes   9     d           9   100
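The final arrange() call is only there to match your expected output; you can omit it if the order doesn't matter. Also note that the values in the count column are characters, so we need to convert them to integer first; this should probably be fixed upstream.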
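As a rough sketch of what fixing it upstream could look like: the columns end up as character because cbind() on mixed vectors builds a character matrix before as.data.frame() is applied. Assuming you control how the data frame is created, building it with data.frame() keeps count numeric, and the as.integer() conversion is no longer needed. The name result is just an illustrative choice; everything else follows the data from the question.

```r
library(dplyr)

# data.frame() keeps each column's own type, whereas cbind() on mixed
# vectors coerces everything to character first.
df_have <- data.frame(
  site    = c("X", "Y", "Y", "X", "X", "X", "Y", "X", "Y",
              "X", "Y", "Y", "X", "X", "X", "Y", "X", "Y"),
  trmt    = c("yes", "yes", "no", "no", "yes", "no", "no", "yes", "yes",
              "yes", "yes", "no", "no", "yes", "no", "no", "yes", "yes"),
  id      = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9),
  species = c("a", "b", "a", "c", "d", "a", "e", "b", "d",
              "a", "b", "m", "c", "p", "a", "q", "r", "d"),
  count   = c(28, 17, 7, 8, 2, 9, 1, 5, 3, 12, 4, 18, 3, 30, 12, 21, 18, 6),
  extra   = rep(c("A", "B"), times = c(8, 10))
)

# Same pipeline as above, minus the as.integer() conversion.
result <- df_have %>%
  group_by(site, trmt, id, species) %>%
  summarise(sum = sum(count), .groups = "drop") %>%  # total count per species within each id
  group_by(id) %>%
  mutate(relabund = sum / sum(sum) * 100) %>%        # share of each species within its id
  ungroup() %>%
  arrange(id, species)

result
```

With this version id and count are stored as doubles rather than characters, so the printed column types will differ from the output shown above, but the sums and relative abundances should be the same.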