I have a single dataset consisting of two columns: "species_id" and "count". Some species appear more than once under differently cased names, e.g. BROC and broc. I would like to combine such rows into one row and sum their count values.
Currently, I have:
species_id count
BRBL 109
BROC 16
broc 7
BRSP 16
And I want:
species_id count
BRBL 109
BROC 23
BRSP 16
Thanks so much! Any help would be greatly appreciated.
Assuming the names differ only in uppercase/lowercase, something like this might work:
library(dplyr)
df <- tibble(species_id = c("BROC", "broc"), count = c(16, 7)) # sample data; tibble() replaces the deprecated data_frame()
df %>%
  mutate(species_id = toupper(species_id)) %>% # normalise case so BROC and broc match
  group_by(species_id) %>%
  summarise(count = sum(count))
If there are differences beyond case, you would probably need to use regular expressions and other data cleaning techniques before grouping, but the idea should be the same.
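As a rough sketch of that kind of cleaning, here is one way it might look. The sample rows with stray whitespace and a hyphen are made up for illustration, and the regular expression assumes that a valid species_id contains only the letters A-Z, which may not hold for your data:

```r
library(dplyr)

# Hypothetical sample data: case differences plus stray whitespace and punctuation
df <- tibble(
  species_id = c("BRBL", "BROC", " broc", "br-oc", "BRSP"),
  count = c(109, 16, 7, 2, 16)
)

result <- df %>%
  mutate(
    species_id = toupper(species_id),            # normalise case first
    species_id = gsub("[^A-Z]", "", species_id)  # assumption: drop anything that is not a letter A-Z
  ) %>%
  group_by(species_id) %>%
  summarise(count = sum(count))

print(result)
```

With this sample data the three BROC variants collapse into a single row with count 25. It is worth checking the distinct species_id values before and after cleaning to make sure no genuinely different species get merged.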
You can use
library(dplyr)
df = df %>%
  mutate(species_id = tolower(as.character(species_id))) %>% # as.character() guards against factor columns
  group_by(species_id) %>%
  summarise(total = sum(count)) %>% # note: the summed column is named total, not count
  ungroup()
Example:
df = data.frame(species_id = c("BROC","broc"),count = c(16,7))
Applying the code above results in:
# A tibble: 1 x 2
species_id total
<chr> <dbl>
1 broc 23
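If you would rather not depend on dplyr, the same idea can probably be expressed in base R with aggregate(). This is only a sketch using the sample data from the question, and it uses toupper() so the result matches the uppercase IDs the question asks for:

```r
# Sample data from the question
df <- data.frame(
  species_id = c("BRBL", "BROC", "broc", "BRSP"),
  count = c(109, 16, 7, 16)
)

# Normalise case, then sum count within each species_id
df$species_id <- toupper(as.character(df$species_id))
result <- aggregate(count ~ species_id, data = df, FUN = sum)

print(result)
```

Here BROC and broc end up as one row with count 23, while BRBL and BRSP are unchanged.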