
How to combine and sum two rows in the same dataset using R

I have a single dataset consisting of two columns: "species_id" and "count". Some species appear more than once under names that differ only in case, for example BROC and broc. I would like to combine such rows into one row and sum their count values.

Currently, I have:

species_id count
BRBL       109
BROC       16
broc       7
BRSP       16

And I want:

species_id count
BRBL       109
BROC       23
BRSP       16

Thanks so much! Any help would be greatly appreciated.

Assuming the names differ only in uppercase/lowercase, something like this might work:

library(dplyr)
df <- tibble(species_id = c("BROC", "broc"), count = c(16, 7)) # sample data; tibble() replaces the deprecated data_frame()
df %>% mutate(species_id = toupper(species_id)) %>% 
    group_by(species_id) %>% summarise(count = sum(count))

If there are differences beyond case, you would probably need regular expressions or other data cleaning techniques before grouping, but the idea is the same: normalise the names first, then group and sum.
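
As a minimal sketch of that cleaning step, assuming the extra differences are stray whitespace or punctuation (an assumption; the question only shows a case difference), the names could be normalised with base R's trimws() and gsub() before grouping. The messy values " broc" and "Broc." are made up for illustration and are not from the question.

```r
library(dplyr)

# Hypothetical messy data: case, leading whitespace and a trailing dot differ
df <- tibble(
  species_id = c("BRBL", "BROC", " broc", "Broc.", "BRSP"),
  count      = c(109, 16, 7, 2, 16)
)

cleaned <- df %>%
  mutate(
    species_id = toupper(trimws(species_id)),       # unify case, trim spaces
    species_id = gsub("[^A-Z0-9]", "", species_id)  # drop anything but letters/digits
  ) %>%
  group_by(species_id) %>%
  summarise(count = sum(count))

print(cleaned)
```

The regular expression here is deliberately blunt; for real data it would need to be adapted to whatever inconsistencies actually occur in the names.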

You can convert the names to a single case and then sum within each group:

library(dplyr)
df = df %>% 
  mutate(species_id = tolower(as.character(species_id))) %>%
  group_by(species_id) %>%
  summarise(total = sum(count)) %>%
  ungroup()

Example:

df = data.frame(species_id = c("BROC","broc"),count = c(16,7))

Applying the code above gives:

# A tibble: 1 x 2
  species_id total
       <chr> <dbl>
1       broc    23
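
If you would rather not depend on dplyr, the same idea can probably be expressed in base R. This is only a sketch, using the four rows from the question, toupper() to match the uppercase names in the desired output, and aggregate() with a formula to sum within each name:

```r
# Data from the question
df <- data.frame(species_id = c("BRBL", "BROC", "broc", "BRSP"),
                 count = c(109, 16, 7, 16))

# Normalise the names, then sum count within each species_id
df$species_id <- toupper(df$species_id)
result <- aggregate(count ~ species_id, data = df, FUN = sum)

print(result)
```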
