I'm not sure how to phrase this better for the title, which is probably keeping me from finding an existing answer.
I have a dataframe that looks like this:
example_df <- data.frame(
  ID = c('A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'),
  location = c('park 1', 'park 1', 'park 2', 'park 3', 'park 1', 'park 4', 'park 1', 'park 5'),
  sample_2000 = c(1, 5, 0, 2, 3, 1, 0, 8),
  sample_2001 = c(2, 1, 1, 3, 5, 6, 4, 2),
  sample_2003 = c(1, 2, 5, 8, 11, 1, 0, 7)
)
  ID location sample_2000 sample_2001 sample_2003
1  A   park 1           1           2           1
2  A   park 1           5           1           2
3  A   park 2           0           1           5
4  B   park 3           2           3           8
5  B   park 1           3           5          11
6  C   park 4           1           6           1
7  C   park 1           0           4           0
8  C   park 5           8           2           7
I want to sum the values for each year by location and end up with the results in a single dataframe. I'm currently using group_by() and summarize() on each year individually and then joining everything back together:
library(dplyr)
summarize1 <- group_by(example_df, location) %>% dplyr::summarize(sample_2000 = sum(sample_2000))
summarize2 <- group_by(example_df, location) %>% dplyr::summarize(sample_2001 = sum(sample_2001))
summarize3 <- group_by(example_df, location) %>% dplyr::summarize(sample_2003 = sum(sample_2003))
all_summarized <- Reduce(function(x, y) merge(x, y, all=TRUE), list(summarize1, summarize2, summarize3))
Desired output (which I receive from the above) looks like this:
  location sample_2000 sample_2001 sample_2003
1   park 1           9          12          14
2   park 2           0           1           5
3   park 3           2           3           8
4   park 4           1           6           1
5   park 5           8           2           7
Surely there's a better method. My attempt at a for-loop fails with the following error:
'Error in sum(paste0("sample_", i)): invalid 'type' (character) of argument'
year_list <- c(2000, 2001, 2003)
for (i in year_list) {
test <- group_by(example_df, location) %>% dplyr::summarize(paste0("sample_", i)) = sum(paste0("sample_", i))
}
Thank you!
If we want to keep an approach similar to Reduce/merge, we can use map/reduce from purrr:
library(dplyr)
library(purrr)
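# Summarise each sample column separately, then join the pieces by location
map(names(example_df)[3:5], ~
    example_df %>%
      select(location, all_of(.x)) %>%
      group_by(location) %>%
      summarise_at(vars(starts_with('sample')), sum)) %>%
  reduce(full_join, by = 'location')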
Or with summarise/across (available from dplyr 1.0.0), we can get the same output in a single step (though I'm not sure whether the example is meant as a general case or is specific to sum):
example_df %>%
  group_by(location) %>%
  summarise(across(starts_with('sample'), sum))
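If only specific years should be summed, as with the year_list in the question, the column names can be built first and passed to across() through all_of(). The following is a minimal sketch under that assumption; the names year_cols and by_year are illustrative and do not come from the original post.

```r
library(dplyr)

# Data from the question
example_df <- data.frame(
  ID = c('A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'),
  location = c('park 1', 'park 1', 'park 2', 'park 3', 'park 1', 'park 4', 'park 1', 'park 5'),
  sample_2000 = c(1, 5, 0, 2, 3, 1, 0, 8),
  sample_2001 = c(2, 1, 1, 3, 5, 6, 4, 2),
  sample_2003 = c(1, 2, 5, 8, 11, 1, 0, 7)
)

# Build the column names from the years of interest (year_cols is a hypothetical name)
year_list <- c(2000, 2001, 2003)
year_cols <- paste0("sample_", year_list)

# Sum only the selected columns within each location
by_year <- example_df %>%
  group_by(location) %>%
  summarise(across(all_of(year_cols), sum), .groups = "drop")

print(by_year)
```

Because all_of() raises an error when a requested column is missing, a typo in year_list should surface immediately rather than silently dropping a year.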
Or with summarise_at, which also works in older versions of dplyr (it is superseded by across as of dplyr 1.0.0 and could be deprecated in the future):
example_df %>%
  group_by(location) %>%
  summarise_at(vars(starts_with('sample')), sum)
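As for the for-loop in the question, the error appears to come from sum(paste0("sample_", i)), which hands sum() a character string rather than the column it names; the assignment also sits outside the summarize() call. One possible repair, sketched here on the assumption that a loop is still wanted, looks the column up with the .data pronoun and names the output with the "{col}" := syntax. The names col and results are illustrative, not from the original post.

```r
library(dplyr)

# Data from the question
example_df <- data.frame(
  ID = c('A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'),
  location = c('park 1', 'park 1', 'park 2', 'park 3', 'park 1', 'park 4', 'park 1', 'park 5'),
  sample_2000 = c(1, 5, 0, 2, 3, 1, 0, 8),
  sample_2001 = c(2, 1, 1, 3, 5, 6, 4, 2),
  sample_2003 = c(1, 2, 5, 8, 11, 1, 0, 7)
)

year_list <- c(2000, 2001, 2003)
results <- list()  # hypothetical container: one summary table per year

for (i in year_list) {
  col <- paste0("sample_", i)
  results[[col]] <- example_df %>%
    group_by(location) %>%
    # .data[[col]] fetches the column named by the string;
    # "{col}" := gives the output column that same name
    summarise("{col}" := sum(.data[[col]]), .groups = "drop")
}

# Join the per-year tables back together, as in the question
all_summarized <- Reduce(function(x, y) merge(x, y, by = "location", all = TRUE), results)
print(all_summarized)
```

This keeps the structure of the original attempt, but the single summarise(across(...)) call is shorter and avoids the join altogether.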