Sum columns based on index in a different data frame in R

Question

I have two data frames similar to this:

df <- data.frame("A1" = c(1, 2, 3), "A2" = c(3, 4, 5), "A3" = c(6, 7, 8), "B1" = c(3, 4, 5))
ref_df <- data.frame("Name" = c("A1", "A2", "A3", "B1"), code = c("Blue", "Blue", "Green", "Green"))

I would like to sum the values in the columns of df based on the code in ref_df, and store the results in a new data frame whose column names match the codes in ref_df.

That is, I would like a new data frame with Blue and Green as columns, where the values are the row-wise sums A1+A2 and A3+B1 respectively, like this one:

result <- data.frame("Blue" = c(4, 6, 8), "Green" = c(9, 11, 13))

There are lots of posts on summing columns based on conditions, but after a morning of research I cannot find anything that solves my exact problem.

Answer 1

We can split the columns of df into groups based on the values in ref_df$code and then take the row-wise sum of each group.

sapply(split.default(df, ref_df$code), rowSums)

#     Blue Green
#[1,]    4     9
#[2,]    6    11
#[3,]    8    13
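
The question asks for a data frame, while sapply() returns a matrix. The following is a minimal sketch, using the df and ref_df from the question, that shows what split.default() produces and wraps the final result in as.data.frame(). The variable names groups and result are only illustrative.

```r
df <- data.frame(A1 = c(1, 2, 3), A2 = c(3, 4, 5), A3 = c(6, 7, 8), B1 = c(3, 4, 5))
ref_df <- data.frame(Name = c("A1", "A2", "A3", "B1"),
                     code = c("Blue", "Blue", "Green", "Green"))

# split.default() treats df as a list of columns and groups those columns
# by code, returning one smaller data frame per code.
groups <- split.default(df, ref_df$code)
names(groups)       # "Blue" "Green"
names(groups$Blue)  # "A1" "A2"

# Row-wise sum within each group, then convert the matrix to a data frame.
result <- as.data.frame(sapply(groups, rowSums))
result
```

This relies on the i-th element of ref_df$code describing the i-th column of df.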

If the rows of ref_df are not in the same order as the columns of df, reorder them first so that row i of ref_df describes column i of df.

ref_df <- ref_df[match(names(df), ref_df$Name), ]
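
A sketch of the alignment step on a deliberately shuffled reference table may make this clearer. The names shuffled_ref, idx, aligned_ref and result2 are made up for illustration. It indexes with match(names(df), shuffled_ref$Name), which returns, for each column of df, the row of the reference table that describes it.

```r
df <- data.frame(A1 = c(1, 2, 3), A2 = c(3, 4, 5), A3 = c(6, 7, 8), B1 = c(3, 4, 5))

# Hypothetical reference table whose rows are NOT in the column order of df.
shuffled_ref <- data.frame(Name = c("A2", "B1", "A1", "A3"),
                           code = c("Blue", "Green", "Blue", "Green"))

# For each column of df, find the row of the reference table with that name.
idx <- match(names(df), shuffled_ref$Name)
aligned_ref <- shuffled_ref[idx, ]

# Fail early if a column of df has no entry in the reference table,
# or if the alignment did not produce the expected order.
stopifnot(!anyNA(idx),
          identical(as.character(aligned_ref$Name), names(df)))

result2 <- as.data.frame(sapply(split.default(df, aligned_ref$code), rowSums))
result2
```

The stopifnot() check is an optional safeguard added here. Without it, a column missing from the reference table would be dropped from the sums without any warning.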

Answer 2

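We can also use the tidyverse: reshape df to long format, join the codes from ref_df, sum by code and row, and reshape back to wide format.

library(dplyr)
library(tidyr)

df %>%
  mutate(rn = row_number()) %>%                      # remember the original row
  pivot_longer(cols = -rn, names_to = 'Name') %>%    # one row per (rn, column) pair
  left_join(ref_df, by = 'Name') %>%                 # attach the code of each column
  group_by(code, rn) %>%
  summarise(Sum = sum(value), .groups = 'drop') %>%  # sum within each code and row
  pivot_wider(names_from = code, values_from = Sum) %>%
  select(-rn)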

2 answers

solution1: 1, ACCEPTED, 2020-02-03 10:19:45
solution2: 0, 2020-02-03 16:59:45
