dplyr: group_by, sum various columns, and apply a function based on grouped row sums?
I'm trying to use dplyr to summarize a dataframe of bird species abundance in forests which are fragmented to some degree.
The first column, percent_cover, has 4 possible values: 10, 25, 50, 75. Then there are ten columns of bird species counts: 'species1' through 'species10'.
I want to group by percent_cover, then sum the other columns and calculate these sums as a percentage of the 4 row sums.
Getting the column sums is easy enough:
df %>% group_by(Percent_cover) %>% summarise_at(vars(contains("species")), sum)
...but what I need is sum/rowSum*100. It seems that some kind of 'rowwise' operation is needed.
Also, out of interest, why does the following not work?
df %>% group_by(Percent_cover) %>% summarise_at(vars(contains("species")), sum*100)
At this point, it's tempting to go back to 'for' loops... or Excel pivot tables.
To use dplyr, try the following:
library(dplyr)

df %>%
  group_by(Percent_cover) %>%
  summarise(across(contains("species"), sum)) %>%
  mutate(rs = rowSums(select(., contains("species")))) %>%
  mutate(across(contains("species"), ~ . / rs * 100)) -> result
result
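The 'rowwise' operation the question hints at works too: after summarising, rowwise() with c_across() produces the same per-row denominator. A sketch on made-up data in the question's shape (the df values here are purely illustrative):

```r
library(dplyr)

# toy data shaped like the question's dataframe (values are illustrative)
df <- tibble(
  Percent_cover = c(10, 10, 25, 50, 75),
  species1 = c(1, 2, 3, 4, 5),
  species2 = c(2, 0, 1, 1, 3)
)

df %>%
  group_by(Percent_cover) %>%
  summarise(across(contains("species"), sum)) %>%
  rowwise() %>%                                     # treat each group's row individually
  mutate(rs = sum(c_across(contains("species")))) %>%
  ungroup() %>%
  mutate(across(contains("species"), ~ .x / rs * 100))
```

Either way, the species columns in each row end up summing to 100.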
For example, using mtcars:
mtcars %>%
  group_by(cyl) %>%
  summarise(across(disp:wt, sum)) %>%
  mutate(rs = rowSums(select(., disp:wt))) %>%
  mutate(across(disp:wt, ~ . / rs * 100))
#    cyl  disp    hp   drat    wt    rs
#  <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl>
#1     4  54.2  42.6  2.10  1.18  2135.
#2     6  58.7  39.2  1.15  0.998 2186.
#3     8  62.0  36.7  0.567 0.702 7974.
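As for why `summarise_at(vars(contains("species")), sum*100)` doesn't work: `sum*100` is evaluated right away, and R cannot multiply the function object `sum` by 100, so it errors with "non-numeric argument to binary operator". summarise_at/summarise expect a function, so the arithmetic has to go inside a lambda that is applied after summing. A minimal sketch on mtcars:

```r
library(dplyr)

# `sum * 100` tries to multiply the function `sum` itself by 100 and errors;
# a purrr-style lambda applies sum() first, then multiplies the result:
mtcars %>%
  group_by(cyl) %>%
  summarise(across(disp:wt, ~ sum(.x) * 100))
```

The same lambda form (`~ sum(.) * 100`) works with the older summarise_at as well.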