
`gam` package: extra shift when overlaying data on `plot.gam`

I am trying to fit a GAM using the gam package (I know mgcv is more flexible, but I need to use gam here). The fitted smooth looks good, but compared with the original data it appears to be offset along the y-axis by a constant value, and I cannot figure out where this offset comes from.

This code reproduces the problem:

library(gam)
data(gam.data)
x <- gam.data$x
y <- gam.data$y
fit <- gam(y ~ s(x,6))

fit$coefficients
#(Intercept)     s(x, 6) 
#   1.921819   -2.318771

plot(fit, ylim = range(y))
points(x, y)
points(x, y -1.921819, col=2)
legend("topright", pch=1, col=1:2, legend=c("Original", "Minus intercept"))

[Image: output of plot(fit), with the original points in black and the intercept-shifted points in red]

Chambers, JM and Hastie, TJ (1993) Statistical Models in S (Chapman & Hall) shows that there should not be an offset, and this is also intuitively correct (the smooth should describe the data).

I noticed something comparable in mgcv, where it can be solved by passing the model's intercept as the shift parameter (because the smooth is apparently centred). I thought the same could be true here, so I subtracted the intercept from the original data points. However, the plot above shows that this idea is wrong: an extra shift remains, and I don't know where it comes from. I hope someone here can help.

(R version 3.3.1; gam version 1.12)

Let me first explain the various components of the fitted GAM object:

library(gam)
data(gam.data)
x <- gam.data$x
y <- gam.data$y
fit <- gam(y ~ s(x, 6), model = FALSE)

## coefficients for parametric part
## this includes intercept and null space of spline
beta <- coef(fit)

## null space of spline smooth (a linear term, just `x`)
nullspace <- fit$smooth.frame[,1]

nullspace - x  ## all 0

## penalized (nonlinear) part of the smooth
## note, the backfitting procedure guarantees that this is centred
pensmooth <- fit$smooth[,1]

sum(pensmooth)  ## centred
# [1] 5.89806e-17

## estimated smooth function (null space + penalized space)
smooth <- nullspace * beta[2] + pensmooth

## centred smooth function (this is what `plot.gam` is going to plot)
c0 <- mean(smooth)
censmooth <- smooth - c0

## additive predictors (this is just fitted values in Gaussian case)
addpred <- beta[1] + smooth

You can first verify that addpred matches fit$additive.predictors. Since we are fitting an additive model with a Gaussian response, this is also the same as fit$fitted.values.
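The check described above can be sketched as follows. This is a minimal, self-contained version that refits the model with the default `model` setting; the `as.numeric()` calls are my addition, there only to strip names and attributes so that `all.equal()` compares the numbers alone.

```r
library(gam)
data(gam.data)
x <- gam.data$x
y <- gam.data$y
fit <- gam(y ~ s(x, 6))

beta <- coef(fit)

# linear (null-space) part times its coefficient, plus the penalized part
smooth <- as.numeric(fit$smooth.frame[, 1]) * beta[2] + fit$smooth[, 1]

# additive predictor rebuilt by hand
addpred <- as.numeric(beta[1] + smooth)

# both comparisons should hold up to numerical tolerance
all.equal(addpred, as.numeric(fit$additive.predictors))
all.equal(addpred, as.numeric(fit$fitted.values))
```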

What plot.gam does is plot censmooth:

## the generic dispatches to the package's plot method (plot.gam in gam 1.12)
plot(fit, col = 4, ylim = c(-1.5, 1.5))
points(x, censmooth, col = "gray")
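As a numerical cross-check that does not rely on reading the plot, the hand-built centred smooth can be compared with the term contributions returned by `predict(fit, type = "terms")`. This is a sketch under the assumption that the plot method draws exactly these centred term contributions. The snippet refits the model so that it runs on its own, and the `as.numeric()` wrappers are my addition.

```r
library(gam)
data(gam.data)
x <- gam.data$x
y <- gam.data$y
fit <- gam(y ~ s(x, 6))

beta <- coef(fit)
smooth <- as.numeric(fit$smooth.frame[, 1]) * beta[2] + fit$smooth[, 1]
censmooth <- as.numeric(smooth - mean(smooth))

# centred contribution of each model term, one column per term
terms_fit <- predict(fit, type = "terms")

# compare the hand-built centred smooth with the s(x, 6) column
all.equal(censmooth, as.numeric(terms_fit[, 1]))
```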

Recall that

addpred = beta[1] + censmooth + c0

So to shift the original data y to match this plot, you need to subtract not only the intercept ( beta[1] ) but also c0 from y:

points(x, y - beta[1] - c0)
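Putting the pieces together, a complete script might look like the one below. It is a sketch rather than the only way to do this; `shift` and `y_shifted` are names I introduce for illustration. Because addpred equals the fitted values, the total shift `beta[1] + c0` is simply the mean of the fitted values, which the script checks before plotting.

```r
library(gam)
data(gam.data)
x <- gam.data$x
y <- gam.data$y
fit <- gam(y ~ s(x, 6))

beta <- coef(fit)
smooth <- as.numeric(fit$smooth.frame[, 1]) * beta[2] + fit$smooth[, 1]
c0 <- mean(smooth)

# total vertical offset between the raw data and the plotted smooth
shift <- as.numeric(beta[1] + c0)

# the offset is the mean of the fitted values
all.equal(shift, mean(fit$fitted.values))

# overlay the shifted data on the centred smooth
y_shifted <- y - shift
plot(fit, ylim = range(y_shifted))
points(x, y_shifted, col = "gray")
```

Computing the offset as `mean(fit$fitted.values)` avoids rebuilding the smooth by hand when only the overlay is needed.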

[Image: the plot.gam output with the shifted data points overlaid]
