How to manually specify outer knots for smoother in gam (mgcv package)

I am fitting GAM models to data using the mgcv package in R. Some of my predictors are circular, so I am using a periodic (cyclic) smoother. I run into an issue in cross-validation: my holdout dataset can contain values outside the range of the training data. Since gam() automatically chooses the knots for the smooths, this leads to an error (see my related question here; thanks to @nograpes and @DWin for their explanations of the errors there).

How can I manually specify the outer knots in a periodic smooth?

Example code

The first block generates some data.

library(mgcv)

set.seed(223) # produces error.
# set.seed(123) # no error.

# generate data:
x <- runif(100,min=-pi,max=pi)
linPred <- 2*cos(x) # value of the linear predictor
theta <- 1 / (1 + exp(-linPred)) # inverse logit: probability that y = 1
y <- rbinom(100,1,theta)
plot(x,theta)
df <- data.frame(x=x,y=y)

The next block fits the GAM model with the periodic smooth:

gamFit <- gam(y ~ s(x,bs="cc",k=5),data=df,family=binomial())
summary(gamFit)
plot(gamFit)

Presumably the knots can be set somewhere in the specification of the smoother term s(x,bs="cc",k=5), but how to do this is not obvious to me from the gam help pages or from googling.

This block predicts on some holdout data and produces the error if you set the seed as above:

# predict y values for new data:
x.2 <- runif(100,min=-pi,max=pi)
df.2 <- data.frame(x=x.2)
predict(gamFit,newdata=df.2)
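
To see why the error depends on the seed, it can help to count how many holdout values fall outside the range of the training data. The following is only a sketch: it regenerates the data in the same order as the blocks above so that it can be run on its own, and the name `outside` is introduced here for illustration.

```r
set.seed(223)

# Draw the random numbers in the same order as the example above.
x   <- runif(100, min = -pi, max = pi)            # training covariate
y   <- rbinom(100, 1, 1 / (1 + exp(-2 * cos(x)))) # training response
x.2 <- runif(100, min = -pi, max = pi)            # holdout covariate

# Holdout values below the training minimum or above the training maximum
# lie outside the knot range that gam() picks by default for the smooth.
outside <- x.2 < min(x) | x.2 > max(x)

range(x)      # range covered by the training data
range(x.2)    # range covered by the holdout data
sum(outside)  # number of holdout points outside the training range
```

If `sum(outside)` is greater than zero, the holdout set contains points beyond the outermost automatically chosen knots, which is the situation described in the question.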

Ideally, I would only set the outer knots and let gam pick the rest.

Apologies if this question is better suited to CrossValidated than SO.

Try this:

gamFit <- gam(y ~ s(x,bs="cc",k=5), 
              knots=list( x=seq(-pi,pi, len=5) ), 
              data=df, family=binomial())

You will find a worked example at:

?smooth.construct.cr.smooth.spec 

I learned in testing this code that the 'k' parameter in s() needs to match the 'len' parameter of the seq() call supplied as the 'x' element of the knots list. I had incorrectly thought that the knots argument would be passed to s(); it is an argument of gam().
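
The question also asks for a way to set only the outer knots. As far as I understand mgcv's handling of "cc" smooths, supplying exactly two knots makes them the end points of the cyclic interval, and the remaining interior knots are then placed automatically. The sketch below relies on that behaviour; the names `gamFit2` and `pred` are introduced here for illustration, and `plogis()` is base R's inverse logit.

```r
library(mgcv)

set.seed(223)

# Same data-generating process as in the question.
x  <- runif(100, min = -pi, max = pi)
y  <- rbinom(100, 1, plogis(2 * cos(x)))
df <- data.frame(x = x, y = y)

# Supply only the two boundary knots. For a "cc" smooth, mgcv is assumed
# to treat them as end points and to place the other k - 2 knots itself.
gamFit2 <- gam(y ~ s(x, bs = "cc", k = 5),
               knots = list(x = c(-pi, pi)),
               data = df, family = binomial())

# Any holdout value in [-pi, pi] now lies inside the knot range.
x.2  <- runif(100, min = -pi, max = pi)
pred <- predict(gamFit2, newdata = data.frame(x = x.2))

# Inspect the knots actually used by the smooth.
gamFit2$smooth[[1]]$xp
```

Fixing the end points at -pi and pi also ties the period of the smooth to the full circle rather than to the observed range of the training data.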
