简体   繁体   中英

Error in terms.formula(formula, data = data) : invalid power in formula

I'm trying to take a transformation of the dose predictor, here is my code:

mod = glm(colonies ~ (as.numeric(as.factor(dose)))^(m), data = salmonella, family = "poisson")

where "m" is the power I use. However, I got an error

> mod = glm(colonies ~ (as.numeric(as.factor(dose)))^(m), data = salmonella, family = "poisson")
Error in terms.formula(formula, data = data) : invalid power in formula

Any one knows why?

Sorry for not being clear. Here my m is -0.18182 from an earlier calculation. I now understand I shouldn't use as.numeric(as.factor). But if the code is

mod = glm(colonies ~ (as.factor(dose))^(m), data = salmonella, family = "poisson")

The error is still here. It's weird because when I change m to 2, it works.

tl;dr my best guess is that you should use I(...^m) to protect ^ /have R treat it as a numerical exponentiation operator.

I found salmonella in the faraway package, and can confirm your error. Indeed, it persists through a variety of simplifications.

m <- 1  ## same results with m <- 2L, etc.
mod = glm(colonies ~ (as.numeric(as.factor(dose)))^(m), data = salmonella, family = "poisson")
mod = glm(colonies ~ dose^(m), data = salmonella, family = "poisson")
mod = glm(colonies ~ dose^m, data = salmonella, family = "poisson")
mod = lm(colonies ~ dose^m, data = salmonella)

It looks like R's formula interface does not allow symbolic substitution in a power in a formula.

However : if what you are really trying to do is

  • convert dose to an evenly spaced integer value (0=1, 10=2, 33=3)
  • use a power of that dose as a predictor in a GLM

then using I() to specify that R should treat ^ as a numeric operator, not an interaction operator in a formula, is what you want:

ss <- transform(salmonella, numdose=as.numeric(as.factor(dose)))
mod = glm(colonies ~ I(numdose^m), data = ss, family = "poisson")

OTOH the picture shows that this isn't completely crazy (although also unnecessary):

library(ggplot2); theme_set(theme_bw())
m <- 2
ggplot(ss,aes(numdose,colonies))+
    geom_point()+
    geom_smooth(method="glm",method.args=list(family=poisson))+
    geom_smooth(method="glm",method.args=list(family=poisson),
                formula=y~I(x^m),colour="red")
ggsave("numdose.png")

在此处输入图片说明

If this is the salmonella dataset from package:'faraway' then you not use either as.factor or as.numeric on the dose value, since it's already numeric.

converting to factor will seriously distort the meaning of "dose"

Furthermore, the proper way to do polynomial models in R is to use the poly function rather than forming quadratic terms. If you insist on using "raw" quadratic terms, then it would be easier to do with poly but as Ben has suggested it should be with the I function

library(faraday)
m=2
mod = glm(colonies ~ I(dose^m), data = salmonella, family = "poisson")

Better, however, would be:

 m=2; mod = glm(colonies ~ poly(dose, m), data = salmonella, family = "poisson")

Which will give you both the linear and the quadratic terms but the quadratic term will be done as an orthogonal polynomial which then lets you make proper inferences.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM