How to combine and sum two rows in the same dataset using R

Question

I have a single dataset with two columns: "species_id" and "count". Some species are repeated but are named differently, e.g. BROC and broc. I would like to combine these two rows into one row and sum their count values.

Currently, I have:

species_id count
BRBL       109
BROC       16
broc       7
BRSP       16

And I want:

species_id count
BRBL       109
BROC       23
BRSP       16

Thanks so much! Any help would be greatly appreciated.

Answer 1

Assuming the names differ only in uppercase/lowercase, something like this might work:

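library(dplyr)
df <- tibble(species_id = c("BROC", "broc"), count = c(16, 7)) # sample data; tibble() replaces the deprecated data_frame()
df %>%
  mutate(species_id = toupper(species_id)) %>% # normalise case so BROC and broc match
  group_by(species_id) %>%
  summarise(count = sum(count)) # one row per species with the summed count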

If the names differ in more than case, you would probably need regular expressions or other data-cleaning steps before grouping, but the idea is the same.
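
As a rough sketch of that cleaning step, the snippet below assumes the extra differences are stray whitespace and punctuation. The messy rows (" broc", "br-oc") and their counts are hypothetical, made up for illustration and not taken from the question. It strips every character that is not a letter or digit before grouping:

```r
library(dplyr)

# Hypothetical messy data: whitespace and punctuation on top of case differences
df <- tibble(
  species_id = c("BRBL", "BROC", " broc", "br-oc", "BRSP"),
  count = c(109, 16, 7, 2, 16)
)

cleaned <- df %>%
  mutate(
    species_id = toupper(species_id),              # normalise case first
    species_id = gsub("[^A-Z0-9]", "", species_id) # drop anything that is not a letter or digit
  ) %>%
  group_by(species_id) %>%
  summarise(count = sum(count))                    # sum counts within each cleaned name

print(cleaned)
```

Which characters are safe to strip depends on the real data, so the pattern passed to gsub() would need adjusting to the naming scheme actually in use.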

Answer 2

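You can use:

library(dplyr)
df = df %>% 
  mutate(species_id = tolower(as.character(species_id))) %>% # convert factors to character, then lowercase
  group_by(species_id) %>%
  summarise(total = sum(count)) %>% # summed count per species, stored in a new column "total"
  ungroup()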

Example:

df = data.frame(species_id = c("BROC","broc"),count = c(16,7))

Applying the code above to this example gives:

# A tibble: 1 x 2
  species_id total
       <chr> <dbl>
1       broc    23
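
For comparison, here is a minimal base R sketch of the same idea on the four rows from the question. It swaps dplyr for aggregate(), upper-cases the names, and keeps the column name "count", both to match the output the question asks for; that is a choice made here, not part of either answer:

```r
# Data from the question
df <- data.frame(
  species_id = c("BRBL", "BROC", "broc", "BRSP"),
  count = c(109, 16, 7, 16)
)

# Normalise case so that BROC and broc fall into the same group
df$species_id <- toupper(as.character(df$species_id))

# Sum count within each species_id
result <- aggregate(count ~ species_id, data = df, FUN = sum)

print(result)
```

This should leave three rows, with BROC carrying the combined count of 16 + 7 = 23.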

2 answers

solution1
1 2017-11-25 02:17:35

solution2
0 2017-11-25 02:18:50
