Use grouped summary to operate in another data.frame column by factor

I want to compute a summary of a grouped data.frame, for example:

library(dplyr)

df_summ <- mtcars %>% group_by(am) %>% summarise(mean_mpg = mean(mpg))

     am mean_mpg
  (dbl)    (dbl)
1     0 17.14737
2     1 24.39231

I then want to use this summary to transform another data.frame that shares the same factor levels but has a different number of rows, for example by calculating the absolute difference between each single value and its group's mean.

Here's the toy example:

toy=data.frame(am=c(1,1,0,0),mpg=c(1,2,3,4))

The calculation I would like to do is y = abs(toy$mpg - df_summ$mean_mpg), matched by factor level.

My head tells me dplyr must be able to do this, but I can't come up with a way. I want to keep the original data.frame (as in, using mtcars %>% group_by(am) %>% mutate(...)).

The expected output looks like this:

toy
  am mpg expected
1  1   1 23.39231
2  1   2 22.39231
3  0   3 14.14737
4  0   4 13.14737

Join the two data frames and then perform the calculation:

toy %>% 
    left_join(df_summ) %>% 
    mutate(y = abs(mpg - mean_mpg))

giving:

Joining, by = "am"
  am mpg mean_mpg        y
1  1   1 24.39231 23.39231
2  1   2 24.39231 22.39231
3  0   3 17.14737 14.14737
4  0   4 17.14737 13.14737
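
If the helper column is not wanted in the result, the same idea can be sketched with an explicit join key and a final select(). This is only one possible variant, not necessarily what the answer intended; the names result, expected and expected_base are chosen here for illustration. The last line shows a base R lookup with match() that should give the same values without a join.

```r
library(dplyr)

# Per-group means from mtcars, as in the question
df_summ <- mtcars %>%
  group_by(am) %>%
  summarise(mean_mpg = mean(mpg))

toy <- data.frame(am = c(1, 1, 0, 0), mpg = c(1, 2, 3, 4))

# Join on the shared key, compute the absolute difference,
# then drop the helper column so only toy's columns plus the result remain
# ("result" and "expected" are illustrative names)
result <- toy %>%
  left_join(df_summ, by = "am") %>%
  mutate(expected = abs(mpg - mean_mpg)) %>%
  select(-mean_mpg)

# Base R alternative: match() finds, for each row of toy,
# the row of df_summ holding that group's mean
toy$expected_base <- abs(toy$mpg - df_summ$mean_mpg[match(toy$am, df_summ$am)])
```

Passing by = "am" also suppresses the "Joining, by" message and guards against accidentally joining on other columns the two data frames might share, such as mpg if the summary had kept that name.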
