Regression model point estimation

I'd like to retrieve the values on a second-order polynomial regression line for a list of values of one predictor.

Here is the model:

fit <- lm(y ~ poly(age, 2) + height + age*height)

I would like to supply a list of values for age and retrieve the corresponding values on the regression line, as well as the standard deviation and standard errors. 'age' is a continuous variable, but I want to create an array of discrete values and return the predicted values from the regression line.

Example:

age <- c(10, 11, 12, 13, 14)

Since you have an interaction term, the regression coefficients for the linear or quadratic age term (or both together) only have meaning when you also specify which value of height is being considered. So to get predictions with height held at its mean value, you would do this:

predict(fit, data.frame(age=c(10, 11, 12, 13, 14), height=mean(height) ) )

bouncyball brings up a good point. You asked about "standard deviation and standard errors", but coefficients and predictions don't have "standard deviations" as the term is usually used. Rather, they have "standard errors of the estimate", usually shortened to just "standard errors". Passing se.fit=TRUE returns them along with the predictions:

predict(fit, data.frame(age=c(10, 11, 12, 13, 14), height=mean(height) ), se.fit=TRUE  )
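As a rough illustration of what the call above returns, here is a minimal sketch on simulated data (the data frame `dat`, its coefficients, and the seed are made up for this example, not taken from the question). The interaction is written as `age:height` rather than `age*height`, because `age*height` expands to `age + height + age:height` and the extra linear `age` term is already spanned by `poly(age, 2)`.

```r
# Simulated data, purely for illustration (not the asker's data)
set.seed(1)
dat <- data.frame(age = runif(50, 8, 16), height = rnorm(50, 6, 1))
dat$y <- 100 + 3 * dat$age - 0.2 * dat$age^2 + 5 * dat$height +
  rnorm(50, sd = 4)

# Same structure as the question's model, without the aliased linear age term
fit <- lm(y ~ poly(age, 2) + height + age:height, data = dat)

# Ages of interest, with height held at its sample mean
new_data <- data.frame(age = c(10, 11, 12, 13, 14),
                       height = mean(dat$height))

# With se.fit = TRUE, predict() returns a list instead of a plain vector
pred <- predict(fit, new_data, se.fit = TRUE)

pred$fit             # fitted values on the regression surface
pred$se.fit          # standard error of each fitted value
pred$df              # residual degrees of freedom
pred$residual.scale  # residual standard deviation (sigma-hat)
```

Of these, `pred$residual.scale` is probably the closest thing to a "standard deviation" here: it estimates the spread of the observations around the fitted surface, while `pred$se.fit` describes the uncertainty of the fitted values themselves.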

I suppose if you did a bootstrap run and used the standard deviations of the separate coefficients across resamples as an estimate of the standard errors of the coefficients, that might be argued to be a standard deviation, but it would be on the scale of the parameter space rather than on the scale of the variables.
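The bootstrap idea could be sketched roughly as follows. This is only one possible approach (case resampling), on simulated data invented for the example; `n_boot`, `form`, and `dat` are hypothetical names. It uses `poly(age, 2, raw = TRUE)` so that the coefficients keep the same meaning in every resample, since the default orthogonal basis is recomputed from each resampled data set.

```r
# Simulated data, purely for illustration (not the asker's data)
set.seed(42)
dat <- data.frame(age = runif(50, 8, 16), height = rnorm(50, 6, 1))
dat$y <- 100 + 3 * dat$age - 0.2 * dat$age^2 + 5 * dat$height +
  rnorm(50, sd = 4)

# Raw polynomial so coefficients are comparable across resamples
form <- y ~ poly(age, 2, raw = TRUE) + height + age:height

n_boot <- 500  # number of bootstrap resamples (arbitrary choice)

# Refit the model on rows resampled with replacement; keep the coefficients
boot_coefs <- t(replicate(n_boot, {
  idx <- sample(nrow(dat), replace = TRUE)
  coef(lm(form, data = dat[idx, ]))
}))

# Standard deviation of each coefficient across resamples
boot_se <- apply(boot_coefs, 2, sd)

# Compare with the usual model-based standard errors
analytic_se <- summary(lm(form, data = dat))$coefficients[, "Std. Error"]
cbind(bootstrap = boot_se, analytic = analytic_se)
```

Both columns are expressed in the units of the coefficients, not of `y`, `age`, or `height`.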

Your model has two predictors, so you need to provide both an age and a height.

For example, using simulated data:

age = sample(10)
height = sort(rnorm(10, 6, 1))
y = sort(rnorm(10, 150, 30))

fit <- lm(y ~ age + poly(age, 2) + height + age*height)

To get predictions, specify the ages and heights and then call predict:

# I'm using my own heights, you should choose the values you're interested in
new.data <- data.frame(age=c(10, 11, 12, 13, 14) , 
                  height=c(5.7, 6.3, 5.8, 5.9, 6.0) )

> predict(fit, new.data)
           1            2            3            4            5 
132.76675715 137.70712251 113.39494557 102.07262016  88.84240532 

To get confidence bands for each prediction:

> predict(fit, new.data, interval="confidence")
           fit            lwr          upr
1 132.76675715  96.0957812269 169.43773307
2 137.70712251  73.2174486246 202.19679641
3 113.39494557  39.5470153667 187.24287578
4 102.07262016   3.5466926099 200.59854771
5  88.84240532 -37.7404171712 215.42522781
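The bands above describe uncertainty in the fitted mean. If the goal is a range for a new individual observation, `predict()` also accepts `interval="prediction"`. A small sketch on simulated data (again invented for illustration, with the interaction written as `age:height` to avoid a redundant linear age term) shows the two side by side:

```r
# Simulated data, purely for illustration (not the data used above)
set.seed(7)
dat <- data.frame(age = runif(60, 8, 16), height = rnorm(60, 6, 1))
dat$y <- 100 + 3 * dat$age - 0.2 * dat$age^2 + 5 * dat$height +
  rnorm(60, sd = 4)

fit <- lm(y ~ poly(age, 2) + height + age:height, data = dat)

new.data <- data.frame(age = c(10, 11, 12, 13, 14),
                       height = c(5.7, 6.3, 5.8, 5.9, 6.0))

# Interval for the mean response at each (age, height) pair
conf_int <- predict(fit, new.data, interval = "confidence")

# Interval for a single new observation at each (age, height) pair
pred_int <- predict(fit, new.data, interval = "prediction")

# Interval widths side by side; the point estimates are identical
cbind(conf_width = conf_int[, "upr"] - conf_int[, "lwr"],
      pred_width = pred_int[, "upr"] - pred_int[, "lwr"])
```

The prediction interval is wider at every point because it adds the residual variance to the uncertainty of the fitted mean.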
