Custom function with dplyr mutate or summarise for different levels within a factor?

Here is some example data:

library(dplyr)  # mtcars ships with base R, so no other package is needed

df1 <- mtcars %>%
    group_by(cyl, gear) %>%
    summarise(newvar = sum(wt))
df1
# A tibble: 8 x 3
# Groups:   cyl [?]
    cyl  gear newvar
  <dbl> <dbl>  <dbl>
1     4     3   2.46
2     4     4  19.0 
3     4     5   3.65
4     6     3   6.68
5     6     4  12.4 
6     6     5   2.77
7     8     3  49.2 
8     8     5   6.74

What if I then wanted to apply a custom function that calculates, for each level of cyl, the difference between the newvar values for cars with 3 gears and cars with 5 gears?

df2 <- df1 %>%  mutate(Diff = newvar[gear == "3"] - newvar[gear == "5"]) 

Or with summarise?

df2 <- df1 %>%  summarise(Diff = newvar[gear == "3"] - newvar[gear == "5"])

There must be a way to apply functions to different levels within a factor?

Any help appreciated!

Your example code is most of the way there. You can do:

df1 %>% 
    mutate(Diff = newvar[gear == "3"] - newvar[gear == "5"])

Or:

df1 %>% 
    summarise(Diff = newvar[gear == "3"] - newvar[gear == "5"])

Logical subsetting works inside mutate() and summarise() calls just as it does with any other vector.

Note that this works because df1 is still grouped by cyl after the summarise() call in your example code. Otherwise, you would need a group_by() call to create the correct grouping.
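To illustrate the grouping point, here is a minimal sketch that drops the grouping and then restores it explicitly with group_by(cyl). It also wraps the subtraction in a small helper, since the question asks about a custom function. The name level_diff and the column names Diff_3_5 and Diff_3_4 are made up for this example and are not part of dplyr. The helper returns NA when a level is missing from a group, which happens here for 8-cylinder cars because the data has no 4-gear row for them.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): difference between the values of `x`
# at two levels of `by`. Returns NA unless each level matches exactly one value.
level_diff <- function(x, by, a, b) {
  xa <- x[by == a]
  xb <- x[by == b]
  if (length(xa) != 1 || length(xb) != 1) return(NA_real_)
  xa - xb
}

# Rebuild the summary table from the question, then drop all grouping
df_flat <- mtcars %>%
  group_by(cyl, gear) %>%
  summarise(newvar = sum(wt)) %>%
  ungroup()

# Regroup explicitly so the subsets are taken within each cylinder level
diffs <- df_flat %>%
  group_by(cyl) %>%
  summarise(
    Diff_3_5 = level_diff(newvar, gear, 3, 5),
    Diff_3_4 = level_diff(newvar, gear, 3, 4)  # cyl 8 has no 4-gear row, so NA
  )
diffs
```

Without the group_by(cyl) step, newvar[gear == 3] would pick up one value per cylinder level at once, so the subtraction would no longer be a per-cylinder comparison.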

Another option is to spread the data into 'wide' format and then take the difference:

library(tidyverse)
df1 %>%
   filter(gear %in% c(3, 5) ) %>% 
   spread(gear, newvar) %>% 
   transmute(newvar = `3` - `5`)
# A tibble: 3 x 2
# Groups:   cyl [3]
#    cyl newvar
#  <dbl>  <dbl>
#1     4  -1.19
#2     6   3.90
#3     8  42.5 
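In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(), so the same idea could be sketched as follows, assuming a tidyr version that provides pivot_wider(). The "gear_" prefix and the Diff column name are choices made for this example only.

```r
library(dplyr)
library(tidyr)

# Rebuild the summary table from the question
df1 <- mtcars %>%
  group_by(cyl, gear) %>%
  summarise(newvar = sum(wt))

wide <- df1 %>%
  ungroup() %>%                       # grouping is not needed once the data is wide
  filter(gear %in% c(3, 5)) %>%       # keep only the two levels being compared
  pivot_wider(names_from = gear,      # one column per gear level
              values_from = newvar,
              names_prefix = "gear_") %>%
  mutate(Diff = gear_3 - gear_5)      # one row per cyl, so this is a plain column subtraction
wide
```

The prefix avoids column names that start with a digit, so the backticks used around `3` and `5` above are not required.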
