简体   繁体   中英

multiply a data.frame by a column with matching condition

I want to multiply several columns from a df by a column from another df.

df1 and df2 have a column called "year_quarter". I want many columns from df1 to be multiplied by a column from df2 so that year_quarter matches.

df1

year_quarter   col1    col2    col3
2010Q1         5       0.34    0.45
2010Q1         4       0.45    0.34
2010Q1         6       0.63    0.86
2010Q1         2       0.75    0.45
2010Q2         3       0.78    0.27
2010Q2         5       0.43    0.38
2010Q2         2       0.34    0.74
2010Q2         1       0.87    0.35
2010Q3         5       0.34    0.45
2010Q3         8       0.54    0.42
2010Q3         9       0.23    0.45
2010Q3         3       0.74    0.34
2010Q4         2       0.72    0.78
2010Q4         7       0.62    0.91
2010Q4         2       0.74    0.10
2010Q4         6       0.73    0.09

df2

year_quarter    ratio
2010Q1          0.96
2010Q2          1.34
2010Q3          1.92
2010Q4          0.74

I want to multiply col1, col2 and col3 in df1 by ratio in df2 where the year_quarter matches in both dfs. ie if quarter_year in df1 = 2010Q1, then col1, col2 and col3 should be multiplied by 0.96 for all instances of 2010Q1 and so on.

We can do this using match . Matching the similar columns from both the dataframes and then getting the corresponding ratio value and multiplying it to df1 excluding 1st column.

df2$ratio[match(df1$year_quarter, df2$year_quarter)] * df1[-1]


#    col1   col2   col3
#1   4.80 0.3264 0.4320
#2   3.84 0.4320 0.3264
#3   5.76 0.6048 0.8256
#4   1.92 0.7200 0.4320
#5   4.02 1.0452 0.3618
#6   6.70 0.5762 0.5092
#7   2.68 0.4556 0.9916
#8   1.34 1.1658 0.4690
#9   9.60 0.6528 0.8640
#10 15.36 1.0368 0.8064
#11 17.28 0.4416 0.8640
#12  5.76 1.4208 0.6528
#13  1.48 0.5328 0.5772
#14  5.18 0.4588 0.6734
#15  1.48 0.5476 0.0740
#16  4.44 0.5402 0.0666

Using dplyr, you merge both datasets, then apply a mutate function to change the values of desired columns, eventually you can remove the ratio column if you don't need it with %>% select(-ratio)

library(dplyr)
left_join(df1, df2) %>% mutate_at(vars(starts_with("col")), funs(.*ratio))

The left join behaviour is such that if you have, for instance, a 2011 date in your first data.frame it will leave the row NA (same behaviour in Ronak's answer)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM