How exactly do you predict in gam? With reproducible example

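How do you predict with mgcv::gam when you have fitted a model that may contain random effects?

The other thread on this site that describes the "exclude" trick does not work for me (https://stats.stackexchange.com/questions/131106/predicting-with-random-effects-in-mgcv-gam).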


library(mgcv)

ya <- rnorm(100, 0, 1)
yb <- rnorm(100, 0, 1.5)
yc <- rnorm(100, 0, 2)
yd <- rnorm(100, 0, 2.5)

yy <- c(ya, yb, yc, yd)  # data from 4 different groups
xx <- factor(c(rep("a", 100), rep("b", 100), rep("c", 100), rep("d", 100)))  # group labels; bs = "re" expects a factor
zz <- rnorm(400, 0, 1)  # some other covariate

model <- gam(yy ~ zz + s(xx, bs = "re"))  # random intercept for each group

predictdata <- data.frame(zz = 5)  # new data; note that xx is not supplied
predict(model, newdata = predictdata, exclude = "s(xx)")  # prediction; this call fails

This produces an error:

Error in model.frame.default(ff, data = newdata, na.action = na.act) : 
  variable lengths differ (found for 'xx')
In addition: Warning messages:
1: In predict.gam(model, newdata = predictdata, exclude = "s(xx)") :
  not all required variables have been supplied in  newdata!

2: 'newdata' had 1 row but variables found have 400 rows 

My mgcv package is up to date.

Edit:

If I change the prediction data to

predictdata <- data.frame(zz = 5, xx = "f")

then it says

Error in predict.gam(model, newdata = predictdata, exclude = "s(xx)") : 
  f not in original fit

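I experimented with your example, and the "exclude" argument does appear to work, provided that newdata supplies a value for the random effect that was present in the original data set used to fit the model. This makes me a bit uneasy, however. Another caveat: "exclude" does not seem to work for models with a variance structure estimated separately by group (I tried this on another data set), i.e. something like s(xx, bs = "re", by = group). You might want to post or move your question to Cross Validated so that other statisticians/analysts see it and can perhaps provide a better answer.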
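For the single-row prediction at zz = 5 from the question, one way to meet that requirement is to borrow any level of xx that was present in the fit as a placeholder. The following is only a minimal sketch under stated assumptions: the seed, the data frame name dat, and the explicit factor() conversion are not in the original post. Because the random-effect term is zeroed by exclude, the result should not depend on which level is borrowed, and it should match the intercept plus the zz slope times 5.

```r
library(mgcv)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
dat <- data.frame(
  yy = c(rnorm(100, 0, 1), rnorm(100, 0, 1.5), rnorm(100, 0, 2), rnorm(100, 0, 2.5)),
  xx = factor(rep(c("a", "b", "c", "d"), each = 100)),  # assumption: xx stored as a factor
  zz = rnorm(400, 0, 1)
)
model <- gam(yy ~ zz + s(xx, bs = "re"), data = dat)

# Borrow a level that exists in the fit; its value is irrelevant once s(xx) is excluded
placeholder <- levels(dat$xx)[1]
predictdata <- data.frame(zz = 5, xx = factor(placeholder, levels = levels(dat$xx)))
p_excl <- predict(model, newdata = predictdata, exclude = "s(xx)")

# Same prediction with a different borrowed level, to confirm the choice does not matter
predictdata_d <- data.frame(zz = 5, xx = factor("d", levels = levels(dat$xx)))
p_excl_d <- predict(model, newdata = predictdata_d, exclude = "s(xx)")

# Fixed-effects-only prediction computed by hand from the coefficients
b <- coef(model)
manual <- unname(b["(Intercept)"] + 5 * b["zz"])

all.equal(as.numeric(p_excl), as.numeric(p_excl_d))
all.equal(as.numeric(p_excl), manual)
```

Both comparisons should report agreement, which suggests the placeholder is only there to satisfy the checks on newdata.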

My code is below. Note that I changed the means of groups a and d, but the overall mean should still be about zero.

library(mgcv)

ya <- rnorm(100, 1, 1)
yb <- rnorm(100, 0, 1.5)
yc <- rnorm(100, 0, 2)
yd <- rnorm(100, -1, 2.5)

yy <- c(ya, yb, yc, yd)  # data from 4 different groups
xx <- factor(c(rep("a", 100), rep("b", 100), rep("c", 100), rep("d", 100)))  # group labels; bs = "re" expects a factor
zz <- rnorm(400, 0, 1)  # some other covariate

some.data <- data.frame(yy, xx, zz)
model <- gam(yy ~ zz + s(xx, bs = "re"), data = some.data)  # the model


# the intercept is the overall mean when zz is zero
summary(model)

# new data: one row per group, all at zz = 0, using only levels seen in the fit
predictdata <- data.frame(zz = c(0, 0, 0, 0), xx = c("a", "b", "c", "d"))

# excluding random effects: the estimate should be the same for all rows and equal to the intercept
predict(model, newdata = predictdata, exclude = "s(xx)")

# including random effects: estimates should differ by group, with 'a' larger and 'd' smaller
predict(model, newdata = predictdata)
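
To check what "exclude" actually removes, the two sets of predictions can be compared against the fitted coefficients. This is a rough sketch rather than part of the original answer: the seed and the names with_re, without_re, re_coefs and group_shift are introduced here for illustration. It relies on the "re" basis contributing one coefficient per group, so the gap between the two predictions for a group should equal that group's random-effect coefficient.

```r
library(mgcv)

set.seed(2)  # assumption: seed added only so the sketch is reproducible
some.data <- data.frame(
  yy = c(rnorm(100, 1, 1), rnorm(100, 0, 1.5), rnorm(100, 0, 2), rnorm(100, -1, 2.5)),
  xx = factor(rep(c("a", "b", "c", "d"), each = 100)),
  zz = rnorm(400, 0, 1)
)
model <- gam(yy ~ zz + s(xx, bs = "re"), data = some.data)

predictdata <- data.frame(zz = 0, xx = factor(c("a", "b", "c", "d")))

# Predictions with and without the random-effect term
with_re    <- as.numeric(predict(model, newdata = predictdata))
without_re <- as.numeric(predict(model, newdata = predictdata, exclude = "s(xx)"))

# Random-effect coefficients are named s(xx).1, s(xx).2, ... in coef(model)
b <- coef(model)
re_coefs <- unname(b[grep("^s\\(xx\\)", names(b))])

# Per-group shift contributed by s(xx); should match the coefficients above
group_shift <- with_re - without_re
cbind(group = levels(some.data$xx), shift = round(group_shift, 3), coef = round(re_coefs, 3))
```

With zz = 0, without_re should equal the intercept in every row, and the shift for group 'a' should be positive and the one for 'd' negative, in line with the comments in the answer's code.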

