How to interpret lm() coefficient estimates when using bs() function for splines

I'm using a set of points that run from (-5,5) through (0,0) to (5,5) in a symmetric "V-shape". I'm fitting a model with lm() and the bs() function to fit a "V-shape" spline:

lm(formula = y ~ bs(x, degree = 1, knots = c(0)))

I get the "V-shape" when I compute predictions with predict() and draw the prediction line. But when I look at the model estimates with coef(), I see estimates that I don't expect.

Coefficients:
                                 Estimate Std. Error t value Pr(>|t|)  
(Intercept)                       4.93821    0.16117  30.639 1.40e-09 ***
bs(x, degree = 1, knots = c(0))1 -5.12079    0.24026 -21.313 2.47e-08 ***
bs(x, degree = 1, knots = c(0))2 -0.05545    0.21701  -0.256    0.805 

I would expect a -1 coefficient for the first part and a +1 coefficient for the second part. Must I interpret the estimates in a different way?

If I fill in the knot manually in the lm() call, then I get these coefficients:

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.18258    0.13558  -1.347    0.215    
x           -1.02416    0.04805 -21.313 2.47e-08 ***
z            2.03723    0.08575  23.759 1.05e-08 ***

That's more like it. Relative to x, the z term (which starts at the knot) brings the slope to about +1 (-1.02 + 2.04).

I want to understand how to interpret the bs() result. I've checked: the predictions from the manual model and the bs() model are exactly the same.

"I would expect a -1 coefficient for the first part and a +1 coefficient for the second part."

I think your question is really about what a B-spline function is. If you want to understand the meaning of the coefficients, you need to know what the basis functions of your spline are. See the following:

library(splines)
x <- seq(-5, 5, length = 100)
b <- bs(x, degree = 1, knots = 0)  ## returns a basis matrix
str(b)  ## check structure
b1 <- b[, 1]  ## basis 1
b2 <- b[, 2]  ## basis 2
par(mfrow = c(1, 2))
plot(x, b1, type = "l", main = "basis 1: b1")
plot(x, b2, type = "l", main = "basis 2: b2")

[Figure: basis 1 (b1) and basis 2 (b2) plotted against x]

Note:

  1. B-splines of degree 1 are tent functions, as you can see from b1;
  2. B-splines of degree 1 are scaled so that their values lie in [0, 1];
  3. a knot of a degree-1 B-spline is where it bends;
  4. B-splines of degree 1 have compact support: each is non-zero over at most three adjacent knots.
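The properties listed above can be checked numerically. The following is a minimal sketch, assuming x spans [-5, 5] so that the boundary knots sit at -5 and 5; the names tent and ramp are introduced here only for illustration:

```r
library(splines)

x <- seq(-5, 5, length.out = 101)
b <- bs(x, degree = 1, knots = 0)

# Closed forms for this particular basis (boundary knots at -5 and 5).
tent <- 1 - abs(x) / 5      # peaks at the knot x = 0
ramp <- pmax(x, 0) / 5      # zero left of the knot, rises to 1 at x = 5

range(b)                               # values stay within [0, 1]
x[which.max(b[, 1])]                   # b1 bends (peaks) at the knot
all.equal(as.numeric(b[, 1]), tent)    # b1 is the tent
all.equal(as.numeric(b[, 2]), ramp)    # b2 is the ramp
```

Because bs() drops the first basis function by default (intercept = FALSE), only the tent and the right-hand ramp appear as columns.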

You can get the (recursive) expression of B-splines from Definition of B-spline. The B-spline of degree 0 is the most basic class, while

  • a B-spline of degree 1 is a linear combination of B-splines of degree 0
  • a B-spline of degree 2 is a linear combination of B-splines of degree 1
  • a B-spline of degree 3 is a linear combination of B-splines of degree 2
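For reference, this recursion is usually written as the Cox-de Boor formula. The knot sequence t_i and the indexing below are notational assumptions made here, not taken from the linked page, and any term with a zero denominator is taken to be zero:

```latex
B_{i,0}(x) =
\begin{cases}
  1, & t_i \le x < t_{i+1} \\
  0, & \text{otherwise}
\end{cases}
\qquad
B_{i,k}(x) = \frac{x - t_i}{t_{i+k} - t_i}\, B_{i,k-1}(x)
           + \frac{t_{i+k+1} - x}{t_{i+k+1} - t_{i+1}}\, B_{i+1,k-1}(x)
```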

(Sorry, I was getting off-topic...)

Your linear regression using B-splines:

y ~ bs(x, degree = 1, knots = 0)

is just doing:

y ~ b1 + b2
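A minimal sketch of that equivalence, using hypothetical noisy V-shaped data (the seed, sample size, and noise level are arbitrary choices, not the asker's data):

```r
library(splines)

set.seed(1)                                # arbitrary seed
x <- seq(-5, 5, length.out = 50)
y <- abs(x) + rnorm(50, sd = 0.25)         # hypothetical noisy V

b  <- bs(x, degree = 1, knots = 0)
b1 <- b[, 1]
b2 <- b[, 2]

fit_bs     <- lm(y ~ bs(x, degree = 1, knots = 0))
fit_manual <- lm(y ~ b1 + b2)

# Same design matrix, so the same estimates (only the names differ).
all.equal(unname(coef(fit_bs)), unname(coef(fit_manual)))
```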

Now you should be able to understand what the coefficients you get mean. The fitted spline function (before adding the intercept 4.93821) is:

-5.12079 * b1 - 0.05545 * b2

In the summary table:

Coefficients:
                                 Estimate Std. Error t value Pr(>|t|)  
(Intercept)                       4.93821    0.16117  30.639 1.40e-09 ***
bs(x, degree = 1, knots = c(0))1 -5.12079    0.24026 -21.313 2.47e-08 ***
bs(x, degree = 1, knots = c(0))2 -0.05545    0.21701  -0.256    0.805 
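To connect these estimates to the expected slopes of -1 and +1, here is a rough worked calculation. It assumes the boundary knots are at the ends of the data range (-5 and 5) and uses the fact that, for this basis, b1 equals 1 at the knot and b2 equals 1 at the right boundary; the variable names are illustrative only:

```r
# Estimates copied from the summary table above.
intercept <- 4.93821
c1 <- -5.12079   # coefficient of b1
c2 <- -0.05545   # coefficient of b2

# Assumed knot layout: boundary knots at the data range, interior knot at 0.
left <- -5; knot <- 0; right <- 5

# Fitted values at the three knots.
at_left  <- intercept          # b1 = 0, b2 = 0
at_knot  <- intercept + c1     # b1 = 1, b2 = 0
at_right <- intercept + c2     # b1 = 0, b2 = 1

slope_left  <- (at_knot - at_left)  / (knot - left)
slope_right <- (at_right - at_knot) / (right - knot)
c(slope_left, slope_right)     # roughly -1.02 and +1.01
```

So the coefficients are heights of the fitted line at the knots relative to the intercept, not slopes; the slopes come from differences between them.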

You might wonder why the coefficient of b2 is not significant. Well, compare your y and b1: your y is a symmetric V-shape, while b1 is an inverted symmetric V-shape. If you first multiply b1 by -1 and then rescale it by multiplying by 5 (this explains the coefficient of about -5 for b1), what do you get? A good match, right? (The intercept of about 5 supplies the vertical shift.) So there is no need for b2.

However, if your y is asymmetric, running through (-5,5) to (0,0) and then to (5,10), you will notice that the coefficients for b1 and b2 are both significant. I think the other answer already gave you such an example.
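The contrast between the two cases is easiest to see without noise. This is a sketch on idealized, noise-free data (an assumption made for illustration), where the fit is exact and the coefficients can be read off directly:

```r
library(splines)

x <- seq(-5, 5, length.out = 101)
y_sym  <- abs(x)                       # (-5,5) -> (0,0) -> (5,5)
y_asym <- ifelse(x < 0, -x, 2 * x)     # (-5,5) -> (0,0) -> (5,10)

coef_sym  <- unname(coef(lm(y_sym  ~ bs(x, degree = 1, knots = 0))))
coef_asym <- unname(coef(lm(y_asym ~ bs(x, degree = 1, knots = 0))))

round(coef_sym, 6)    # intercept 5, b1 -5, b2 0
round(coef_asym, 6)   # intercept 5, b1 -5, b2 5
```

In the symmetric case the right end sits at the same height as the intercept, so b2 contributes nothing; in the asymmetric case the right end is 5 units above it.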


Reparametrization of a fitted B-spline to piecewise polynomials is demonstrated here: Reparametrize fitted regression spline as piece-wise polynomials and export polynomial coefficients.
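For the degree-1, single-knot case, a simpler reparametrization is a hinge term. The sketch below assumes z = pmax(x, 0), which is consistent with the manual coefficients shown in the question but is not confirmed by it; the data are hypothetical:

```r
library(splines)

set.seed(2)                            # arbitrary seed
x <- seq(-5, 5, length.out = 60)
y <- abs(x) + rnorm(60, sd = 0.25)     # hypothetical noisy V

z <- pmax(x, 0)   # assumed hinge term: 0 left of the knot, x right of it

fit_bs    <- lm(y ~ bs(x, degree = 1, knots = 0))
fit_hinge <- lm(y ~ x + z)

# Both bases span the same piecewise-linear functions, so the fits agree.
all.equal(unname(fitted(fit_bs)), unname(fitted(fit_hinge)))

coef(fit_hinge)   # x: left slope; x + z: right slope
```

In this parametrization the coefficient of x is the slope left of the knot, and the coefficient of z is the change in slope at the knot.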

A simple example of a first-degree spline with a single knot, with an interpretation of the estimated coefficients to calculate the slopes of the fitted lines:

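library(splines)
set.seed(313)
x <- seq(-5, +5, length.out = 1000)
# asymmetric V: 5 -> 0 on the left half, 0 -> 10 on the right half, plus noise
y <- c(seq(5, 0, length.out = 500) + rnorm(500, 0, 0.25),
       seq(0, 10, length.out = 500) + rnorm(500, 0, 0.25))
plot(x, y, xlim = c(-6, +6), ylim = c(0, +8))
fit <- lm(formula = y ~ bs(x, degree = 1, knots = c(0)))
# overlay the fitted line on a sub-range of x
x.predict <- seq(-2.5, +2.5, length.out = 100)
lines(x.predict, predict(fit, data.frame(x = x.predict)), col = 2, lwd = 2)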

This produces the plot:

[Figure: scatter plot of x and y with the fitted line overlaid]

Since we are fitting a spline with degree=1 (i.e. piecewise straight lines) and with a knot at x=0, we have two lines, one for x<=0 and one for x>0.

The coefficients are:

> round(summary(fit)$coefficients,3)
                                 Estimate Std. Error  t value Pr(>|t|)
(Intercept)                         5.014      0.021  241.961        0
bs(x, degree = 1, knots = c(0))1   -5.041      0.030 -166.156        0
bs(x, degree = 1, knots = c(0))2    4.964      0.027  182.915        0

These can be translated into the slope of each straight line using the knot (which we specified at x=0) and the boundary knots (the min/max of the explanatory data):

# two boundary knots and one specified
knot.boundary.left <- min(x)
knot <- 0
knot.boundary.right <- max(x)

slope.1 <- summary(fit)$coefficients[2,1] /(knot - knot.boundary.left)
slope.2 <- (summary(fit)$coefficients[3,1] - summary(fit)$coefficients[2,1]) / (knot.boundary.right - knot)
slope.1
slope.2
> slope.1
[1] -1.008238
> slope.2
[1] 2.000988
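As a cross-check on the slope calculation above, the same slopes can be obtained by predicting at the three knots and differencing. This is a sketch on simulated data in the style of the example; the object names kn, slopes_predict, and slopes_coef are introduced here for illustration:

```r
library(splines)

set.seed(313)
x <- seq(-5, 5, length.out = 1000)
y <- c(seq(5, 0, length.out = 500), seq(0, 10, length.out = 500)) +
  rnorm(1000, 0, 0.25)
fit <- lm(y ~ bs(x, degree = 1, knots = 0))

# Left boundary knot, interior knot, right boundary knot.
kn <- c(min(x), 0, max(x))

# Slopes from fitted values at the knots: rise over run on each piece.
slopes_predict <- diff(predict(fit, data.frame(x = kn))) / diff(kn)

# Slopes from the coefficients, as in the calculation above.
cf <- unname(coef(fit))
slopes_coef <- c(cf[2] / (kn[2] - kn[1]),
                 (cf[3] - cf[2]) / (kn[3] - kn[2]))

all.equal(unname(slopes_predict), slopes_coef)
```

The predict-based version has the advantage of not depending on how the basis happens to be scaled.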
