How do I specify a relationship between parameter estimates in lm?

Question

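Using lm, I would like to fit the model: y = b0 + b1 * x1 + b2 * x2 + b1 * b2 * x1 * x2

My question is: how can I specify that the coefficient of the interaction should equal the product of the main-effect coefficients?

I have seen that a coefficient can be fixed at a specific value using offset() and I(), but I don't know how to specify a relationship between coefficients.

Here is a simple simulated dataset: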

n <- 50 # Sample size
x1 <- rnorm(n, 1:n, 0.5) # Independent variable 1
x2 <- rnorm(n, 1:n, 0.5) # Independent variable 2
b0 <- 1 
b1 <- 0.5
b2 <- 0.2
y <- b0 + b1*x1 + b2*x2 + b1*b2*x1*x2 + rnorm(n,0,0.1)

To fit model 1: y = b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2, I would use:

summary(lm(y~ x1 + x2 + x1:x2))

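But how do I fit model 2: y = b0 + b1 * x1 + b2 * x2 + b1 * b2 * x1 * x2?

One of the main differences between the two models is the number of parameters to estimate. In model 1 we estimate 4 parameters: b0 (the intercept), b1 (the slope for variable 1), b2 (the slope for variable 2) and b3 (the slope for the interaction between variables 1 and 2). In model 2 we estimate 3 parameters: b0 (the intercept), b1 (the slope for variable 1 and part of the slope for the interaction between variables 1 and 2) and b2 (the slope for variable 2 and part of the slope for the interaction between variables 1 and 2).

The reason I want to do this is that, when testing whether there is a significant interaction between x1 and x2, model 2 (y = b0 + b1 * x1 + b2 * x2 + b1 * b2 * x1 * x2) would be a better null model than y = b0 + b1 * x1 + b2 * x2.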

Many thanks!

Marie

Answer 1

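Because of the constraint you are placing on the coefficients, the model you specify is not a linear model, so it cannot be fit with lm. You will need to use nonlinear regression, such as nls.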

> summary(nls(y ~ b0 + b1*x1 + b2*x2 + b1*b2*x1*x2, start=list(b0=0, b1=1, b2=1)))

Formula: y ~ b0 + b1 * x1 + b2 * x2 + b1 * b2 * x1 * x2

Parameters:
   Estimate Std. Error t value Pr(>|t|)    
b0 0.987203   0.049713   19.86   <2e-16 ***
b1 0.494438   0.007803   63.37   <2e-16 ***
b2 0.202396   0.003359   60.25   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1121 on 47 degrees of freedom

Number of iterations to convergence: 5 
Achieved convergence tolerance: 2.545e-06

You can really see that the model is nonlinear when you rewrite it as

> summary(nls(y ~ b0+(1+b1*x1)*(1+b2*x2)-1, start=list(b0=0, b1=1, b2=1)))

Formula: y ~ b0 + (1 + b1 * x1) * (1 + b2 * x2) - 1

Parameters:
   Estimate Std. Error t value Pr(>|t|)    
b0 0.987203   0.049713   19.86   <2e-16 ***
b1 0.494438   0.007803   63.37   <2e-16 ***
b2 0.202396   0.003359   60.25   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1121 on 47 degrees of freedom

Number of iterations to convergence: 5 
Achieved convergence tolerance: 2.25e-06
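
nls needs starting values, and the ones above (b0=0, b1=1, b2=1) were simply picked by hand. As a minimal sketch of one alternative, and not something the answer itself proposes, the starting values could be taken from the unconstrained lm fit. The seed and the names full and start_vals are assumptions introduced here for illustration.

```r
# Simulate data as in the question (the seed is an assumption, added for reproducibility)
set.seed(1)
n  <- 50
x1 <- rnorm(n, 1:n, 0.5)
x2 <- rnorm(n, 1:n, 0.5)
y  <- 1 + 0.5*x1 + 0.2*x2 + 0.5*0.2*x1*x2 + rnorm(n, 0, 0.1)

# Unconstrained linear fit; its main-effect estimates serve as starting values
full <- lm(y ~ x1 + x2 + x1:x2)
start_vals <- list(b0 = unname(coef(full)["(Intercept)"]),
                   b1 = unname(coef(full)["x1"]),
                   b2 = unname(coef(full)["x2"]))

# Constrained fit: the interaction coefficient is forced to equal b1*b2
fit <- nls(y ~ b0 + b1*x1 + b2*x2 + b1*b2*x1*x2, start = start_vals)
coef(fit)
```

Starting close to a plausible solution tends to matter more when x1 and x2 are strongly correlated, as they are in this simulation.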

Answer 2

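Brian has provided a way to fit the constrained model you specified, but if you are interested in whether the unconstrained model fits better than the constrained model, you can use the delta method to test that hypothesis.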

# Let's make some fake data where the constrained model is true
n <- 100
b0 <- 2
b1 <- .2
b2 <- -1.3
b3 <- b1 * b2
sigma <- 1

x1 <- rnorm(n)
# make x1 and x2 correlated for giggles
x2 <- x1 + rnorm(n) 
# Generate data according to the model
y <- b0 + b1*x1 + b2*x2 + b3*x1*x2 + rnorm(n, 0, sigma)

# Fit full model y = b0 + b1*x1 + b2*x2 + b3*x1*x2 + error
o <- lm(y ~ x1 + x2 + x1:x2)

# If we want to do a hypothesis test of Ho: b3 = b1*b2
# this is the same as Ho: b3 - b1*b2 = 0
library(msm)
# Get estimate of the difference specified in the null
est <- unname(coef(o)["x1:x2"] - coef(o)["x1"] * coef(o)["x2"])
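# Use the delta method to get a standard error for
# this difference. In the formula, msm names the parameters
# x1, x2, ... by position in coef(o): x1 = intercept,
# x2 = coefficient of x1, x3 = coefficient of x2, x4 = x1:x2
standerr <- deltamethod(~ x4 - x3*x2, coef(o), vcov(o))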

# Calculate a test statistic.  We're relying on asymptotic
# arguments here so hopefully we have a decent sample size
z <- est/standerr
# Calculate p-value
pval <- 2 * pnorm(-abs(z))
pval

In this blog post I explain what the delta method is for, and give more information on how to use it in R.
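
For readers without the msm package, the same standard error can be sketched by hand. This is a rough illustration, not the answer's own code: the delta method approximates the variance of g(b) = b3 - b1*b2 by grad' V grad, where grad is the gradient of g at the estimates and V is vcov(o). The seed and the names b, V and grad are assumptions introduced here.

```r
# Fake data where the constrained model is true (the seed is an assumption)
set.seed(42)
n  <- 100
x1 <- rnorm(n)
x2 <- x1 + rnorm(n)
y  <- 2 + 0.2*x1 - 1.3*x2 + (0.2 * -1.3)*x1*x2 + rnorm(n, 0, 1)

o <- lm(y ~ x1 + x2 + x1:x2)
b <- coef(o)   # order: (Intercept), x1, x2, x1:x2
V <- vcov(o)

# g(b) = b3 - b1*b2, the quantity that is zero under the null hypothesis
est <- unname(b["x1:x2"] - b["x1"] * b["x2"])

# Gradient of g with respect to (b0, b1, b2, b3)
grad <- c(0, -unname(b["x2"]), -unname(b["x1"]), 1)

# Delta-method standard error: sqrt(grad' V grad)
standerr <- sqrt(drop(t(grad) %*% V %*% grad))

z    <- est / standerr
pval <- 2 * pnorm(-abs(z))
pval
```

With the same data and fit, this should agree with the deltamethod() call above, since both use a first-order approximation around the estimated coefficients.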

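Extending Brian's answer, you could alternatively do this by comparing the full model with the constrained model. However, you have to fit the full model with nls as well in order to compare the models easily.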

o2 <- nls(y ~ b0 + b1*x1 + b2*x2 + b1*b2*x1*x2, start=list(b0=0, b1=1, b2=1))
o3 <- nls(y ~ b0 + b1*x1 + b2*x2 + b3*x1*x2, start = list(b0 = 0, b1 = 1, b2 = 1, b3 = 1))
anova(o2, o3)
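
To make the comparison less of a black box, here is a hedged sketch of an F test computed by hand from the residual sums of squares. It assumes an approximate F test is acceptable for the nonlinear constrained fit, and it fits the full model with lm rather than nls, which gives the same least-squares fit for a linear model. The seed and the names full, constrained, f_stat and p_value are assumptions introduced here.

```r
# Fake data where the constrained model is true (the seed is an assumption)
set.seed(42)
n  <- 100
x1 <- rnorm(n)
x2 <- x1 + rnorm(n)
y  <- 2 + 0.2*x1 - 1.3*x2 + (0.2 * -1.3)*x1*x2 + rnorm(n, 0, 1)

# Full (unconstrained) model: 4 parameters
full <- lm(y ~ x1 + x2 + x1:x2)
cf   <- unname(coef(full))

# Constrained model: 3 parameters, started from the lm estimates
constrained <- nls(y ~ b0 + b1*x1 + b2*x2 + b1*b2*x1*x2,
                   start = list(b0 = cf[1], b1 = cf[2], b2 = cf[3]))

# Residual sums of squares and residual degrees of freedom
rss_full <- deviance(full)
rss_con  <- deviance(constrained)
df_full  <- df.residual(full)
df_con   <- df.residual(constrained)

# Approximate F statistic for the single constraint b3 = b1*b2
f_stat  <- ((rss_con - rss_full) / (df_con - df_full)) / (rss_full / df_full)
p_value <- pf(f_stat, df_con - df_full, df_full, lower.tail = FALSE)
c(F = f_stat, p = p_value)
```

A large p-value here would mean the data give no evidence against the constraint b3 = b1*b2.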

Answer 3

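There is no way to do what you are asking in lm, and no reason for it to be able to. You run lm to get estimates of your coefficients. If you don't want to estimate a coefficient, don't include the predictor in the model. You can use coef to extract the coefficients you want and multiply them afterwards.

Note that leaving out the interaction gives a different model and will produce a different b1 and b2. You could also keep I(x1 * x2) in the model and simply not use its coefficient.

As for why you want to do this, there is no a priori reason why the constrained model would actually fit better than the simple additive model. Having more free parameters necessarily means a model fits better, but that is not what you have added: you have added a constraint, which in the real world could make the fit worse. In that case, would you still consider it a better "baseline" for comparison with the model that includes the interaction?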
