简体   繁体   中英

How to write an S3 formula method for a statistical model object in R

I have a function that carries out Box's M test for equality of covariance matrices in a multivariate linear model. I'd like to turn it into an S3 generic function with a formula method, which is the most natural interface.

The complete current code is at https://gist.github.com/friendly/749b5a69a067e02b87dd . I could paste it all in here, but perhaps that link is sufficient.

I don't understand a lot of the magic used in functions that access model object components. I used as a template the code I found in leveneTest in the car package, that solves a similar problem for univariate models.

Here is a quick test using the default method boxM.default :

data(iris)
res <- boxM(iris[, 1:4], iris[, "Species"])
res

which gives the desired result:

>     data(iris)
>     res <- boxM(iris[, 1:4], iris[, "Species"])
>     res

        Box's M-test for Homogeneity of Covariance Matrices

data:  iris[, 1:4]
Chi-Sq (approx.) = 140.94, df = 20, p-value < 2.2e-16
> 

When I try to call the formula method boxM.formula directly, it also works, giving the same output as above.

boxM( cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species, data=iris)

However, this test of the boxM.lm method fails:

> iris.mod <- lm(cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species, data=iris)
> boxM(iris.mod)
Error in cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) : 
  object 'Sepal.Length' not found
> traceback()
8: cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)
7: eval(expr, envir, enclos)
6: eval(predvars, data, env)
5: model.frame.default(form, data)
4: model.frame(form, data) at boxM.R#59
3: boxM.formula(formula(y), data = model.frame(y), ...) at boxM.R#76
2: boxM.lm(iris.mod) at boxM.R#2
1: boxM(iris.mod)
>

I think I understand why it fails --- something to do with the environment for finding the variables in the model.frame() , but not how to correct it.

Can someone help?

You designed your boxM function can take an lm object as an input. The implementation tries to extract the formula and the model.frame from the lm and reuse them with boxM.formula .

It seems the reason why this did not work out is because model.frame(iris.mod) does not return the original data.frame but a 2-column data.frame where the 1st column contains the matrix of left-hand side variables, and the 2nd the vector of the right-hand side. You can check this by

class(model.frame(iris.mod))
dim(model.frame(iris.mod))
names(model.frame(iris.mod))
model.frame(iris.mod)[,1]
model.frame(iris.mod)[,2]

Since model.frame(iris.mod) already parsed the data into the computable format, you can apply boxM.default instead of boxM.formula when an lm object is the input. For example, this seems to work:

boxM.default(Y = model.frame(iris.mod)[,1], 
             group = model.frame(iris.mod)[,2])

#    Box's M-test for Homogeneity of Covariance Matrices

#data:  model.frame(iris.mod)[, 1]
#Chi-Sq (approx.) = 140.94, df = 20, p-value < 2.2e-16

My colleague who solved this said, "You're getting bitten by nonstandard evaluation."

Here is a solution that works, and is more generally in accord with S3 methods for model objects. It finds the data in the environment of the model formula.

boxM.lm <- function(y, ...) {
  data <- getCall(y)$data
  y <- if (!is.null(data)) {
    data <- eval(data, envir = environment(formula(y)))
    update(y, formula(y), data = data)
  }
  else update(y, formula(y))

  boxM.formula(formula(y), data=eval(data, envir = environment(formula(y))), ...)
}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM