简体   繁体   中英

In R, how to combine two data frames where column names in one equals row values in another?

Say I have one data frame of tooth brush brands and a measure of how popular they are over time:

year  brand_1  brand_2
2010  0.7      0.3
2011  0.6      0.6
2012  0.4      0.9

And another that says when each tooth brush brand went electrical, with NA meaning they never did so:

brand    went_electrical_year
brand_1  NA
brand_2  2011

Now I'd like to combine these to get the prevalence of electrical tooth brush brands (as a proportion of the total) each year:

year  electrical_prevalence
2010  0
2011  0.5
2012  0.69

In 2010 it's 0 b/c none of the brands are electrical. In 2011 it's 0.5 b/c both are and they are equally prevalent. In 2012 it's 0.69 b/c both are but the electrical one is more prevalent.

I've wrestled with this in R but can't figure out a way to do it. Would appreciate any help or suggestions. Cheers.

Assuming your data frames are df1 and df2 , you can use the following tidyverse approach.

First, use pivot_longer to put your data into a long format which will be easier to manipulate. Use left_join to add the relevant years of when the brands went electrical.

We can create an indicator mult which will be 1 if the brand has gone electrical, or zero if it hadn't.

Then, for each year, you can determine the proportion by multiplying the popularity value by mult for each brand, and then dividing by the total sum for that year.

library(tidyverse)

df1 %>%
  pivot_longer(cols = -year) %>%
  left_join(df2, by = c("name" = "brand")) %>%
  mutate(mult = ifelse(went_electrical_year > year | is.na(went_electrical_year), 0, 1)) %>%
  group_by(year) %>%
  summarise(electrical_prevalence = sum(value * mult) / sum(value))

Output

   year electrical_prevalence
  <int>                 <dbl>
1  2010                 0    
2  2011                 0.5  
3  2012                 0.692

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM