How do I specify a relationship between parameter estimates in lm?

Question

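Using lm, I would like to fit the model: y = b0 + b1 * x1 + b2 * x2 + b1 * b2 * x1 * x2

My question is: how can I specify that the coefficient of the interaction should equal the product of the main-effect coefficients?

I have seen that to set a coefficient to a specific value one can use offset() and I(), but I don't know how to specify a relationship between coefficients.

Here is a simple simulated dataset: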

n <- 50 # Sample size
x1 <- rnorm(n, 1:n, 0.5) # Independent variable 1
x2 <- rnorm(n, 1:n, 0.5) # Independent variable 2
b0 <- 1 
b1 <- 0.5
b2 <- 0.2
y <- b0 + b1*x1 + b2*x2 + b1*b2*x1*x2 + rnorm(n,0,0.1)

To fit model 1: y = b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2, I would use:

summary(lm(y~ x1 + x2 + x1:x2))

But how do I fit model 2: y = b0 + b1 * x1 + b2 * x2 + b1 * b2 * x1 * x2?

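One of the main differences between the two models is the number of parameters to estimate. In model 1 we estimate 4 parameters: b0 (intercept), b1 (slope for variable 1), b2 (slope for variable 2), and b3 (slope for the interaction between variables 1 and 2). In model 2 we estimate 3 parameters: b0 (intercept), b1 (slope for variable 1 and part of the slope for the interaction between variables 1 and 2), and b2 (slope for variable 2 and part of the slope for the interaction between variables 1 and 2).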

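The reason I want to do this is that, when investigating whether there is a significant interaction between x1 and x2, model 2 (y = b0 + b1 * x1 + b2 * x2 + b1 * b2 * x1 * x2) may be a better null model than y = b0 + b1 * x1 + b2 * x2.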

Many thanks!

Mary

Answer 1

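Because of the constraint you impose on the coefficients, the model you specify is not a linear model, so it cannot be fit with lm. You will need to use nonlinear regression, such as nls.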

> summary(nls(y ~ b0 + b1*x1 + b2*x2 + b1*b2*x1*x2, start=list(b0=0, b1=1, b2=1)))

Formula: y ~ b0 + b1 * x1 + b2 * x2 + b1 * b2 * x1 * x2

Parameters:
   Estimate Std. Error t value Pr(>|t|)    
b0 0.987203   0.049713   19.86   <2e-16 ***
b1 0.494438   0.007803   63.37   <2e-16 ***
b2 0.202396   0.003359   60.25   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1121 on 47 degrees of freedom

Number of iterations to convergence: 5 
Achieved convergence tolerance: 2.545e-06

You can really see that the model is nonlinear when you rewrite it as:

> summary(nls(y ~ b0+(1+b1*x1)*(1+b2*x2)-1, start=list(b0=0, b1=1, b2=1)))

Formula: y ~ b0 + (1 + b1 * x1) * (1 + b2 * x2) - 1

Parameters:
   Estimate Std. Error t value Pr(>|t|)    
b0 0.987203   0.049713   19.86   <2e-16 ***
b1 0.494438   0.007803   63.37   <2e-16 ***
b2 0.202396   0.003359   60.25   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1121 on 47 degrees of freedom

Number of iterations to convergence: 5 
Achieved convergence tolerance: 2.25e-06
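
The two nls calls give the same estimates because the two formulas describe the same mean function: expanding (1 + b1*x1)*(1 + b2*x2) - 1 gives b1*x1 + b2*x2 + b1*b2*x1*x2. A minimal sketch that checks this numerically, assuming the simulation settings from the question (the seed and the names mu_sum, mu_prod, and same are illustrative only):

```r
# Simulated inputs mirroring the question's setup; the seed is an arbitrary choice
set.seed(1)
n  <- 50
x1 <- rnorm(n, 1:n, 0.5)
x2 <- rnorm(n, 1:n, 0.5)
b0 <- 1; b1 <- 0.5; b2 <- 0.2

# Mean function as written in the question
mu_sum  <- b0 + b1*x1 + b2*x2 + b1*b2*x1*x2
# The same mean function written in product form
mu_prod <- b0 + (1 + b1*x1) * (1 + b2*x2) - 1

# Compare the two up to floating-point tolerance
same <- isTRUE(all.equal(mu_sum, mu_prod))
same
```

The product form makes the nonlinearity visible: b1 and b2 multiply each other, so the mean is not a linear function of the parameters.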

Answer 2

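Brian has given a way to fit the constrained model you specify, but if you are interested in whether the unconstrained model fits better than the constrained one, you can use the delta method to test that hypothesis.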

# Let's make some fake data where the constrained model is true
n <- 100
b0 <- 2
b1 <- .2
b2 <- -1.3
b3 <- b1 * b2
sigma <- 1

x1 <- rnorm(n)
# make x1 and x2 correlated for giggles
x2 <- x1 + rnorm(n) 
# Generate data according to the model
y <- b0 + b1*x1 + b2*x2 + b3*x1*x2 + rnorm(n, 0, sigma)

# Fit full model y = b0 + b1*x1 + b2*x2 + b3*x1*x2 + error
o <- lm(y ~ x1 + x2 + x1:x2)

# If we want to do a hypothesis test of Ho: b3 = b1*b2
# this is the same as Ho: b3 - b1*b2 = 0
library(msm)
# Get estimate of the difference specified in the null
est <- unname(coef(o)["x1:x2"] - coef(o)["x1"] * coef(o)["x2"])
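# Use the delta method to get a standard error for
# this difference. In the deltamethod formula, x1..x4 refer to
# the 1st..4th elements of coef(o), not to the predictors:
# x2 = coefficient of x1, x3 = coefficient of x2,
# x4 = coefficient of x1:x2
standerr <- deltamethod(~ x4 - x3*x2, coef(o), vcov(o))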

# Calculate a test statistic.  We're relying on asymptotic
# arguments here so hopefully we have a decent sample size
z <- est/standerr
# Calculate p-value
pval <- 2 * pnorm(-abs(z))
pval
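
For readers without the msm package, the same standard error can be sketched by hand. The delta method approximates the variance of g(b) = b3 - b1*b2 by grad' V grad, where grad is the gradient of g at the estimates and V is vcov of the fit. This is only an illustration under the same simulation assumptions as above; the seed and the names fit, grad, est_manual, se_manual, z_manual, and p_manual are not from the original answer.

```r
# Simulated data where the constraint b3 = b1*b2 holds; the seed is arbitrary
set.seed(42)
n  <- 100
x1 <- rnorm(n)
x2 <- x1 + rnorm(n)
y  <- 2 + 0.2*x1 - 1.3*x2 + (0.2 * -1.3)*x1*x2 + rnorm(n)

fit <- lm(y ~ x1 + x2 + x1:x2)
b <- coef(fit)   # order: (Intercept), x1, x2, x1:x2
V <- vcov(fit)

# g(b) = b3 - b1*b2; gradient with respect to (b0, b1, b2, b3)
grad <- c(0, -b[["x2"]], -b[["x1"]], 1)

# Point estimate and first-order (delta method) standard error
est_manual <- b[["x1:x2"]] - b[["x1"]] * b[["x2"]]
se_manual  <- sqrt(drop(t(grad) %*% V %*% grad))

# Asymptotic z test of Ho: b3 - b1*b2 = 0
z_manual <- est_manual / se_manual
p_manual <- 2 * pnorm(-abs(z_manual))
p_manual
```

Because g is a simple polynomial, the gradient can be written analytically here; deltamethod derives it symbolically from the formula instead.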

I explain more about what the delta method is used for, and how to use it in R, in this blog post.

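Expanding on Brian's answer, you could alternatively do this by comparing the full model with the constrained model. However, you have to use nls to fit the full model to be able to compare the models easily.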

o2 <- nls(y ~ b0 + b1*x1 + b2*x2 + b1*b2*x1*x2, start=list(b0=0, b1=1, b2=1))
o3 <- nls(y ~ b0 + b1*x1 + b2*x2 + b3*x1*x2, start = list(b0 = 0, b1 = 1, b2 = 1, b3 = 1))
anova(o2, o3)
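
The comparison can also be sketched by hand from the residual sums of squares, which shows what the F test is doing. The sketch below assumes the same kind of simulated data as above, fits the unconstrained model with lm (it is linear, so lm gives its least-squares fit), and takes the nls starting values from an additive lm fit. That choice of starting values, the seed, and all object names are assumptions made for illustration.

```r
# Simulated data where the constraint holds; the seed is arbitrary
set.seed(7)
n  <- 100
x1 <- rnorm(n)
x2 <- x1 + rnorm(n)
y  <- 2 + 0.2*x1 - 1.3*x2 + (0.2 * -1.3)*x1*x2 + rnorm(n)

# Unconstrained (full) model: 4 free parameters
full <- lm(y ~ x1 + x2 + x1:x2)

# Constrained model: 3 free parameters, started from the additive fit
add_fit <- lm(y ~ x1 + x2)
start <- list(b0 = coef(add_fit)[["(Intercept)"]],
              b1 = coef(add_fit)[["x1"]],
              b2 = coef(add_fit)[["x2"]])
constrained <- nls(y ~ b0 + b1*x1 + b2*x2 + b1*b2*x1*x2, start = start)

# Residual sums of squares and residual degrees of freedom
rss_full <- deviance(full)
rss_con  <- deviance(constrained)
df_full  <- df.residual(full)

# F statistic for one restriction (4 parameters versus 3)
f_stat <- ((rss_con - rss_full) / 1) / (rss_full / df_full)
p_val  <- pf(f_stat, 1, df_full, lower.tail = FALSE)
p_val
```

Since the constrained model is nonlinear in its parameters, the F reference distribution is only approximate here.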

Answer 3

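What you are asking for cannot be done with lm, and there is no reason for it to be able to. You run lm to get estimates of coefficients. If you don't want a coefficient estimated, then don't include the predictor in the model. You can use coef to extract the coefficients you want and multiply them afterwards.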

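Note that leaving out the interaction is a different model and will produce different b1 and b2. You could alternatively keep I(x1 * x2) in the model and simply not use its coefficient.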
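
A minimal sketch of both suggestions, assuming simulated data like that in the question (the seed and the names additive, implied_product, and with_int are illustrative). Note that multiplying coefficients after the fact is not the same as the constrained least-squares fit from nls, because the product term plays no role in the estimation.

```r
# Simulated data following the question's setup; the seed is arbitrary
set.seed(3)
n  <- 50
x1 <- rnorm(n, 1:n, 0.5)
x2 <- rnorm(n, 1:n, 0.5)
y  <- 1 + 0.5*x1 + 0.2*x2 + 0.5*0.2*x1*x2 + rnorm(n, 0, 0.1)

# Option 1: fit without the interaction, then multiply the two slopes
additive <- lm(y ~ x1 + x2)
implied_product <- coef(additive)[["x1"]] * coef(additive)[["x2"]]
implied_product

# Option 2: keep the product term in the model and ignore its coefficient
with_int <- lm(y ~ x1 + x2 + I(x1 * x2))
coef(with_int)
```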

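As for why you would want to do this, there is no a priori reason why the constrained model would actually fit better than the simple additive model. Having more free parameters necessarily means a model fits at least as well, but you have not added a parameter; you have added a constraint, which in the real world could make the model fit worse. In that case, would you still consider it a better "baseline" for comparison with the model that includes the interaction?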
