Error in contrasts: group by factor and the minus operator in `formula()` stops working

The error occurs when using group_by() on a factor, even though this factor is afterwards removed from the model with the minus operator (-). My motivating example:

library(tidyverse)
df = mtcars %>% mutate(am = factor(am))
fits = df %>%
  group_by(am) %>%
  do(fit = lm(formula(mpg ~ . - am), .)) # Returns the error

Which gives the following error message:

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels

I get the same error if I filter() instead of grouping:

fit_am0 = df %>% 
  filter(am == 0) %>%
  lm(formula(mpg ~ . - am), .) # Returns the error

It is as if the formula() function does not properly detect the minus operator (- am) when the variable I try to remove is a factor, i.e. only the combination of the two fails. This is my guess, since the following examples work without error:

fits = mtcars %>% # `am` is numeric
  group_by(am) %>%
  do(fit = lm(formula(mpg ~ . - am), .)) # No error
fit_am0 = df %>%
  filter(am == 0) %>%
  select(-am) %>% # `am` removed prior to running model
  lm(formula(mpg ~ .), .) # No error
fits2 = mtcars %>% 
  mutate(vs = factor(vs)) %>% # A non-grouped factor, later removed
  group_by(am) %>%
  do(fit = lm(formula(mpg ~ . - vs), .)) # No error

Is this a bug? Or did I make an error in my motivating example?
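
A likely explanation, sketched here in base R rather than taken from the original post: the minus operator does remove am from the model terms, but the model frame that lm() builds still appears to carry the am column. Within a single group that factor has only one level left once unused levels are dropped, and setting contrasts on it fails. The name `sub` below is a hypothetical stand-in for the am == 0 subset.

```r
# Base R only; `sub` is a hypothetical name for the am == 0 subset.
df  <- transform(mtcars, am = factor(am))
sub <- df[df$am == "0", ]

# The minus operator removes `am` from the model terms ...
tt <- terms(mpg ~ . - am, data = sub)
am_in_terms <- "am" %in% attr(tt, "term.labels")

# ... but the model frame still contains the `am` column.
# lm() asks for drop.unused.levels = TRUE, so `am` keeps only one level here.
mf <- model.frame(mpg ~ . - am, data = sub, drop.unused.levels = TRUE)
am_in_frame  <- "am" %in% names(mf)
am_level_cnt <- nlevels(mf$am)

# Fitting on the subset reproduces the failure; capture the message
# instead of stopping the script.
res <- tryCatch(
  lm(mpg ~ . - am, data = sub),
  error = function(e) conditionMessage(e)
)

print(c(am_in_terms = am_in_terms, am_in_frame = am_in_frame))
print(am_level_cnt)
print(res)
```

If this reading is right, it would also explain why the numeric am and the non-grouped factor vs cause no trouble: a numeric column needs no contrasts, and vs keeps both of its levels inside each am group.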

I found a solution: remove the factor in the data argument instead of in the formula argument, i.e. lm(formula = formula(mpg ~ .), data = select(., -am)).

library(tidyverse)
df = mtcars %>% mutate(am = factor(am))
fits = df %>%
  group_by(am) %>%
  do(fit = lm(
    formula(mpg ~ .), 
    select(., -am)
  )) # No error
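
An alternative sketch, not part of the original answer: build the formula explicitly so that am is never named in it, and the grouping column can stay in the data. It assumes base R's reformulate() and a plain split()/lapply() loop in place of group_by()/do(); `predictors` and `f` are illustrative names.

```r
df <- transform(mtcars, am = factor(am))

# Every column except the response and the grouping factor becomes a predictor.
predictors <- setdiff(names(df), c("mpg", "am"))
f <- reformulate(predictors, response = "mpg")

# One fit per level of `am`; the formula never mentions `am`,
# so the column can remain in each subset.
fits <- lapply(split(df, df$am), function(d) lm(f, data = d))

print(f)
print(names(fits))
```

The trade-off is that the predictor list is fixed when the formula is built, so columns added to df afterwards are not picked up automatically the way they would be with `.`.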
