
How to write an S3 formula method for a statistical model object in R

I have a function that carries out Box's M test for equality of covariance matrices in a multivariate linear model. I'd like to turn it into an S3 generic function with a formula method, which is the most natural interface.

The complete current code is at https://gist.github.com/friendly/749b5a69a067e02b87dd . I could paste it all in here, but perhaps that link is sufficient.

I don't understand a lot of the magic used in functions that access model object components. As a template, I used the code I found in leveneTest in the car package, which solves a similar problem for univariate models.

Here is a quick test using the default method boxM.default:

data(iris)
res <- boxM(iris[, 1:4], iris[, "Species"])
res

which gives the desired result:

>     data(iris)
>     res <- boxM(iris[, 1:4], iris[, "Species"])
>     res

        Box's M-test for Homogeneity of Covariance Matrices

data:  iris[, 1:4]
Chi-Sq (approx.) = 140.94, df = 20, p-value < 2.2e-16
> 

When I try to call the formula method boxM.formula directly, it also works, giving the same output as above.

boxM( cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species, data=iris)
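For readers unfamiliar with the layout, here is a minimal sketch of how a generic, a default method, and a formula method usually fit together. The generic groupCov() is hypothetical and is not part of the gist; it only computes per-group covariance matrices (the quantities Box's M test compares), and it assumes a single grouping factor on the right-hand side of the formula.

```r
# Hypothetical generic, for illustration only (not the boxM code from the gist)
groupCov <- function(Y, ...) UseMethod("groupCov")

# Default method: numeric matrix or data frame plus a grouping factor
groupCov.default <- function(Y, group, ...) {
  lapply(split(as.data.frame(Y), group), cov)
}

# Formula method: build a model frame, then hand off to the default method.
# Assumes the formula has the form  cbind(y1, y2, ...) ~ group
groupCov.formula <- function(Y, data, ...) {
  mf <- model.frame(Y, data)
  groupCov.default(model.response(mf), mf[[2]], ...)
}

# Dispatch picks the formula method because the first argument is a formula
res <- groupCov(cbind(Sepal.Length, Sepal.Width) ~ Species, data = iris)
names(res)
```

The formula method does no statistics of its own: it only translates the formula interface into the (Y, group) interface of the default method.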

However, this test of the boxM.lm method fails:

> iris.mod <- lm(cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species, data=iris)
> boxM(iris.mod)
Error in cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) : 
  object 'Sepal.Length' not found
> traceback()
8: cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)
7: eval(expr, envir, enclos)
6: eval(predvars, data, env)
5: model.frame.default(form, data)
4: model.frame(form, data) at boxM.R#59
3: boxM.formula(formula(y), data = model.frame(y), ...) at boxM.R#76
2: boxM.lm(iris.mod) at boxM.R#2
1: boxM(iris.mod)
>

I think I understand why it fails (something to do with the environment for finding the variables in model.frame()), but not how to correct it.

Can someone help?

You designed your boxM function so that it can take an lm object as input. The implementation tries to extract the formula and the model.frame from the lm object and reuse them with boxM.formula.

It seems the reason this did not work is that model.frame(iris.mod) does not return the original data.frame but a 2-column data.frame, where the 1st column contains the matrix of left-hand-side variables and the 2nd the vector from the right-hand side. The first column is named after the whole cbind(...) expression, so a variable such as Sepal.Length can no longer be found in it by name. You can check this with:

class(model.frame(iris.mod))
dim(model.frame(iris.mod))
names(model.frame(iris.mod))
model.frame(iris.mod)[,1]
model.frame(iris.mod)[,2]
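The failure can be reproduced without boxM at all. The sketch below re-evaluates the model formula against the model frame, which is what the traceback in the question shows boxM.lm doing; it assumes iris is not attached, so Sepal.Length is not visible anywhere on the search path.

```r
# Fit the multivariate linear model from the question
iris.mod <- lm(cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species,
               data = iris)

mf <- model.frame(iris.mod)
names(mf)            # response column is named after the whole cbind(...) expression
is.matrix(mf[[1]])   # the four responses are stored as one matrix column

# Re-evaluating the formula needs Sepal.Length etc. by name, which mf lacks,
# so model.frame() signals an error; capture its message instead of stopping
res <- tryCatch(model.frame(formula(iris.mod), data = mf),
                error = function(e) conditionMessage(e))
res
```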

Since model.frame(iris.mod) has already parsed the data into a computable format, you can apply boxM.default instead of boxM.formula when the input is an lm object. For example, this seems to work:

boxM.default(Y = model.frame(iris.mod)[,1], 
             group = model.frame(iris.mod)[,2])

#    Box's M-test for Homogeneity of Covariance Matrices

#data:  model.frame(iris.mod)[, 1]
#Chi-Sq (approx.) = 140.94, df = 20, p-value < 2.2e-16
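The same extraction could be wrapped in a small helper, sketched below. The name response_and_group() is hypothetical, and the helper assumes the model has exactly one predictor, a grouping factor, so that the second column of the model frame is the group.

```r
# Hypothetical helper: pull the pieces boxM.default needs out of a fitted model.
# Assumes a model of the form  lm(cbind(y1, y2, ...) ~ group)
response_and_group <- function(mod) {
  mf <- model.frame(mod)
  list(Y = model.response(mf),   # response matrix
       group = mf[[2]])          # the single grouping factor
}

iris.mod <- lm(cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species,
               data = iris)
parts <- response_and_group(iris.mod)
dim(parts$Y)
levels(parts$group)

# With the gist's code loaded, these could then be passed on as
#   boxM.default(parts$Y, parts$group)
```

Using model.response() rather than mf[, 1] makes the intent explicit, but picking the group by position would break for models with more than one term on the right-hand side.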

My colleague who solved this said, "You're getting bitten by nonstandard evaluation."

Here is a solution that works and is more generally in accord with S3 methods for model objects. It finds the data in the environment of the model formula.

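boxM.lm <- function(y, ...) {
  # The unevaluated `data` argument of the original lm() call (NULL if none was given)
  data <- getCall(y)$data
  y <- if (!is.null(data)) {
    # Look the data up where the model formula was created, then refit with it
    data <- eval(data, envir = environment(formula(y)))
    update(y, formula(y), data = data)
  }
  else update(y, formula(y))

  # Hand the original formula and the recovered data to the formula method
  boxM.formula(formula(y), data=eval(data, envir = environment(formula(y))), ...)
}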
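To see why the lookup goes through environment(formula(y)), here is a small sketch in which the data exist only inside a function. The names fit_in_function() and local_data are made up for the illustration.

```r
# Hypothetical example: the data frame lives only inside this function
fit_in_function <- function() {
  local_data <- iris[iris$Species != "setosa", ]
  lm(cbind(Sepal.Length, Sepal.Width) ~ Species, data = local_data)
}

mod <- fit_in_function()

# The stored call holds only the unevaluated symbol `local_data`
data_expr <- getCall(mod)$data
is.symbol(data_expr)

# The formula remembers the environment it was created in (the function's frame),
# so evaluating the symbol there recovers the data used for the fit
dat <- eval(data_expr, envir = environment(formula(mod)))
nrow(dat)
```

Evaluating data_expr at top level would fail here, because local_data is not visible outside the function.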
