
Confidence interval of polynomial regression

I have a little issue with R and statistics.

I fitted a model with the Maximum Likelihood method, which gave me the following coefficients with their respective standard errors (among other parameter estimates):

    ParamIndex   Estimate     SE        
1         a0  0.2135187 0.02990105  
2         a1  1.1343072 0.26123775  
3         a2 -1.0000000 0.25552696  

From these I can draw my curve:

 y= 0.2135187 + 1.1343072 * x - 1 * I(x^2)

But from that, I now have to calculate the confidence interval around this curve, and I don't have a clear idea how to do that.

Apparently, I should use the propagation of error/uncertainty, but the methods I found require the raw data, or more than just the polynomial formula.

Is there any method in R to calculate the CI of my curve when only the SEs of the estimates are known?

Thank you for your help.


Edit:

So, right now, I have the covariance table (v) obtained with the function vcov:

                 a0           a1           a2
    a0  0.000894073 -0.003622614  0.002874075
    a1 -0.003622614  0.068245163 -0.065114661
    a2  0.002874075 -0.065114661  0.065294027

and n = 279.

You don't have enough information right now. To compute the confidence interval of your fitted curve, a complete variance-covariance matrix for your three coefficients is required, but right now you only have the diagonal entries of that matrix.

If you had fitted an orthogonal polynomial, the variance-covariance matrix would be diagonal, with identical diagonal elements. This is certainly not your case, as:

  • the standard errors you show are different from each other;
  • you have explicitly used raw polynomial notation: x + I(x ^ 2)

> but the methods I found require the raw data

It's not the "raw data" used for fitting the model. It is "new data" at which you want to produce the confidence band. However, you do need to know the number of observations used for fitting the model, say n, as that is necessary to derive the residual degrees of freedom. In your case, with 3 coefficients, the degrees of freedom are n - 3.
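
To make the role of the degrees of freedom concrete, here is a small sketch that takes n = 279 from the question's edit and compares the t critical value with its normal counterpart. The variable names (p, df, t_crit, z_crit) are only illustrative.

```r
n  <- 279      # number of observations, as reported in the question
p  <- 3        # number of polynomial coefficients: a0, a1, a2
df <- n - p    # residual degrees of freedom

t_crit <- qt(0.975, df)   # two-sided 95% critical value from the t distribution
z_crit <- qnorm(0.975)    # normal-approximation counterpart

c(df = df, t = t_crit, z = z_crit)
```

With this many observations the two critical values are very close, so the exact choice of degrees of freedom has only a small effect on the width of the band.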

Once you have:

  • the full variance-covariance matrix, let's say V;
  • n, the number of observations used for model fitting;
  • x, a vector of points at which to produce the confidence band,

you can first get the prediction standard error from:

X <- cbind(1, x, x ^ 2)    ## prediction matrix
e <- sqrt( rowSums(X * (X %*% V)) )    ## prediction standard error
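
The expression for e comes from the usual variance rule for a linear combination of estimates. As a sketch, write x_0 for one row of X and number the entries of V from 0 to 2 to match a0, a1, a2 (this indexing is only a notational assumption):

```latex
\begin{aligned}
\hat{\mu}(x) &= x_0^{\top}\hat{\beta}, \qquad x_0 = (1,\; x,\; x^2)^{\top}, \\
\operatorname{Var}\bigl(\hat{\mu}(x)\bigr) &= x_0^{\top} V x_0
  = V_{00} + x^2 V_{11} + x^4 V_{22} + 2x\,V_{01} + 2x^2\,V_{02} + 2x^3\,V_{12}, \\
e(x) &= \sqrt{x_0^{\top} V x_0}.
\end{aligned}
```

The off-diagonal terms V_01, V_02 and V_12 are the reason the standard errors alone are not sufficient.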

You know how to get the predicted mean from your fitted polynomial formula, right? Suppose the mean is mu; then for a 95% CI, use

## residual degree of freedom: n - 3
mu + e * qt(0.025, n - 3)  ## lower bound
mu - e * qt(0.025, n - 3)  ## upper bound
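
The steps so far can be collected into one function. This is only a sketch: curve_ci is a hypothetical helper, not part of any package, and the estimates and covariance matrix are copied from the question's edit.

```r
# Hypothetical helper: pointwise confidence band for
# y = a0 + a1 * x + a2 * x^2, given the estimates and their covariance matrix.
curve_ci <- function(beta, V, n, x, level = 0.95) {
  X  <- cbind(1, x, x^2)                  # prediction matrix
  mu <- drop(X %*% beta)                  # fitted mean at each x
  e  <- sqrt(rowSums(X * (X %*% V)))      # standard error of the fitted mean
  tq <- qt(1 - (1 - level) / 2, n - length(beta))  # positive t quantile
  data.frame(x = x, fit = mu, se = e,
             lower = mu - tq * e, upper = mu + tq * e)
}

# Values copied from the question
beta <- c(0.2135187, 1.1343072, -1)
V <- matrix(c( 0.000894073, -0.003622614,  0.002874075,
              -0.003622614,  0.068245163, -0.065114661,
               0.002874075, -0.065114661,  0.065294027), nrow = 3)

res <- curve_ci(beta, V, n = 279, x = c(0, 0.5, 1))
res
```

Because the positive quantile is used here, the lower bound is written as mu - tq * e, which is equivalent to mu + e * qt(0.025, n - 3) above.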

The complete theory is covered in "How does predict.lm() compute confidence interval and prediction interval?"


Update

Based on the covariance matrix you provided, it is now possible to produce some results and a figure.

V <- structure(c(0.000894073, -0.003622614, 0.002874075, -0.003622614, 
0.068245163, -0.065114661, 0.002874075, -0.065114661, 0.065294027
), .Dim = c(3L, 3L), .Dimnames = list(c("a0", "a1", "a2"), c("a0", 
"a1", "a2")))
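
As a quick sanity check (a sketch, with se_reported typed in from the table in the question), the square roots of the diagonal of V should reproduce the reported standard errors, and cov2cor() shows how strongly the estimates are correlated:

```r
nm <- c("a0", "a1", "a2")
V <- matrix(c( 0.000894073, -0.003622614,  0.002874075,
              -0.003622614,  0.068245163, -0.065114661,
               0.002874075, -0.065114661,  0.065294027),
            nrow = 3, dimnames = list(nm, nm))

se_reported <- c(a0 = 0.02990105, a1 = 0.26123775, a2 = 0.25552696)
se_from_V   <- sqrt(diag(V))          # standard errors implied by V

round(cbind(se_reported, se_from_V), 6)
cov2cor(V)                            # correlation matrix of the estimates
```

The estimates of a1 and a2 are strongly negatively correlated, which is why ignoring the off-diagonal entries would give a misleading band.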

Suppose we want to produce the CI at x = seq(-5, 5, by = 0.2):

beta <- c(0.2135187, 1.1343072, -1.0000000)
x <- seq(-5, 5, by = 0.2)
X <- cbind(1, x, x ^ 2)
mu <- X %*% beta
e <- sqrt( rowSums(X * (X %*% V)) )
n <- 279
lo <- mu + e * qt(0.025, n - 3)
up <- mu - e * qt(0.025, n - 3)
matplot(x, cbind(mu, lo, up), type = "l", col = 1, lty = c(1,2,2))
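
The rowSums() expression is a shortcut for the diagonal of X V X' that avoids forming the full 51 x 51 matrix. A small sketch to confirm that the two agree on this grid (e_fast and e_full are just illustrative names):

```r
V <- matrix(c( 0.000894073, -0.003622614,  0.002874075,
              -0.003622614,  0.068245163, -0.065114661,
               0.002874075, -0.065114661,  0.065294027), nrow = 3)
x <- seq(-5, 5, by = 0.2)
X <- cbind(1, x, x^2)

e_fast <- sqrt(rowSums(X * (X %*% V)))    # row-wise quadratic forms only
e_full <- sqrt(diag(X %*% V %*% t(X)))    # full matrix, then its diagonal

all.equal(e_fast, e_full)
```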

