Custom link function works for GLM, but not for mgcv GAM

Question

Apologies if the answer is obvious, but I've spent quite some time trying to use a custom link function in mgcv::gam.

In short:

I want to use a modified probit link from the psyphy package (psyphy::probit.2asym; I call it custom_link below).
I can create a {stats} family object with this link and use it in the family argument of glm:
m <- glm(y~x, family=binomial(link=custom_link), ... )
It does not work when used as an argument for {mgcv} gam:
m <- gam(y~s(x), family=binomial(link=custom_link), ... )

I get the error: Error in fix.family.link.family(family) : link not recognised

I do not understand the reason for this error; both glm and gam work if I specify the standard link=probit.

So my question can be summarized as:

What is missing in this custom link that works for glm but not for gam?

Thanks in advance for any hint on what I should do.

Link function

probit.2asym <- function(g, lam) {
    if ((g < 0) || (g > 1))
        stop("g must be in (0, 1)")
    if ((lam < 0) || (lam > 1))
        stop("lam outside (0, 1)")
    ## link: maps mu in (g, 1 - lam) to the probit scale
    linkfun <- function(mu) {
        mu <- pmin(mu, 1 - (lam + .Machine$double.eps))
        mu <- pmax(mu, g + .Machine$double.eps)
        qnorm((mu - g) / (1 - g - lam))
    }
    ## inverse link: maps eta back to (g, 1 - lam)
    linkinv <- function(eta) {
        g + (1 - g - lam) * pnorm(eta)
    }
    ## derivative of the inverse link with respect to eta
    mu.eta <- function(eta) {
        (1 - g - lam) * dnorm(eta)
    }
    valideta <- function(eta) TRUE
    link <- paste("probit.2asym(", g, ", ", lam, ")", sep = "")
    structure(list(linkfun = linkfun, linkinv = linkinv,
                   mu.eta = mu.eta, valideta = valideta, name = link),
              class = "link-glm")
}

Answer 1

As you may know, glm fits a model by iteratively reweighted least squares (IRLS). Early versions of gam extended this to penalized iteratively reweighted least squares, which is done by the gam.fit function. This is known as performance iteration in some contexts.

Since 2008 (or maybe slightly earlier), gam.fit3, based on what is called outer iteration, has replaced gam.fit as the gam default. This change requires some extra information from the family, namely higher-order derivatives of the link function, which you can read about in ?fix.family.link.

The major difference between the two is how the iteration over the coefficients beta and the iteration over the smoothing parameters lambda are arranged.

Performance iteration interleaves them: at each reweighted least squares step, lambda is re-estimated for the current working linear model and beta is then updated;
Outer iteration nests them: lambda is optimized in an outer loop, and for each trial value of lambda the iteration for beta is run to convergence.
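To make the contrast concrete, here is a toy sketch in base R for a penalized logistic regression with a single smoothing parameter chosen by GCV. It is not mgcv code: the helpers pwls, working, gcv_working, perf_iter, pirls, gcv_outer and outer_iter are hypothetical names, and the ridge penalty on a polynomial basis is an assumption standing in for a real spline penalty. In perf_iter, lambda is re-estimated for the working linear model inside every reweighted least squares step; in outer_iter, the reweighted least squares loop is run to convergence for each trial lambda.

```r
## Toy contrast of the two schemes; all helpers are illustrative, not mgcv internals.
set.seed(1)
n <- 200
x <- seq(0, 1, length.out = n)
y <- as.numeric(rbinom(n, 1, plogis(-1 + 3 * x)))
X <- cbind(1, poly(x, 5))          # small polynomial basis (assumption)
S <- diag(c(0, rep(1, 5)))         # ridge penalty, intercept unpenalized
fam <- binomial()

## One penalized weighted least squares solve: fitted eta and effective df
pwls <- function(z, w, lambda) {
  XWX <- crossprod(X, w * X)
  beta <- solve(XWX + lambda * S, crossprod(X, w * z))
  list(eta = drop(X %*% beta),
       edf = sum(diag(solve(XWX + lambda * S, XWX))))
}

## IRLS working response and weights at the current linear predictor
working <- function(eta) {
  mu <- fam$linkinv(eta)
  dmu <- fam$mu.eta(eta)
  list(z = eta + (y - mu) / dmu, w = dmu^2 / fam$variance(mu))
}

## GCV of the working (linearized) problem as a function of log(lambda)
gcv_working <- function(rho, z, w) {
  fit <- pwls(z, w, exp(rho))
  n * sum(w * (z - fit$eta)^2) / (n - fit$edf)^2
}

## Performance iteration: re-estimate lambda inside every IRLS step
perf_iter <- function(maxit = 50, tol = 1e-6) {
  eta <- rep(0, n)
  for (it in seq_len(maxit)) {
    wk <- working(eta)
    rho <- optimize(gcv_working, c(-10, 10), z = wk$z, w = wk$w)$minimum
    eta_new <- pwls(wk$z, wk$w, exp(rho))$eta
    converged <- max(abs(eta_new - eta)) < tol
    eta <- eta_new
    if (converged) break
  }
  list(lambda = exp(rho), eta = eta)
}

## Inner IRLS run to convergence for a fixed lambda
pirls <- function(lambda, maxit = 100, tol = 1e-8) {
  eta <- rep(0, n)
  for (it in seq_len(maxit)) {
    wk <- working(eta)
    fit <- pwls(wk$z, wk$w, lambda)
    converged <- max(abs(fit$eta - eta)) < tol
    eta <- fit$eta
    if (converged) break
  }
  fit
}

## Outer iteration: optimize a deviance-based GCV over log(lambda),
## running the inner IRLS to convergence at every trial value
gcv_outer <- function(rho) {
  fit <- pirls(exp(rho))
  dev <- sum(fam$dev.resids(y, fam$linkinv(fit$eta), rep(1, n)))
  n * dev / (n - fit$edf)^2
}
outer_iter <- function() {
  rho <- optimize(gcv_outer, c(-10, 10))$minimum
  c(list(lambda = exp(rho)), pirls(exp(rho)))
}

res_perf <- perf_iter()
res_outer <- outer_iter()
```

The outer scheme always evaluates its criterion at a fully converged fit. The performance scheme changes the working problem and lambda together at every step, so it has no guarantee of settling down.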

Outer iteration is more stable and less likely to suffer from convergence failure.

gam has an argument optimizer. By default it takes optimizer = c("outer", "newton"), that is, the Newton method for outer iteration; but if you set optimizer = "perf", it will use performance iteration. The accepted values have changed across mgcv releases, so check ?gam for your installed version.

So, after the above overview, we have two options:

still use outer iteration, but extend your customized link function;
use performance iteration to stay in line with glm.

I am being lazy, so I will demonstrate the second one (actually, I am not feeling too confident about taking the first approach).
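For anyone who wants to try the first option, here is a minimal, untested-against-mgcv sketch of the extra components. It assumes that the extra information is the second, third and fourth derivatives of the link with respect to mu, stored as d2link, d3link and d4link on the family object, and that fix.family.link leaves a family alone when all three are already present; please verify both points in ?fix.family.link for your version. The formulas follow from the chain rule applied to eta = qnorm((mu - g) / (1 - g - lam)); the names g0, lam0, span, eta_of and dens are only for illustration.

```r
## Compact copy of the custom link so that this block is self-contained
probit.2asym <- function(g, lam) {
  linkfun <- function(mu) {
    mu <- pmin(mu, 1 - (lam + .Machine$double.eps))
    mu <- pmax(mu, g + .Machine$double.eps)
    qnorm((mu - g) / (1 - g - lam))
  }
  linkinv <- function(eta) g + (1 - g - lam) * pnorm(eta)
  mu.eta <- function(eta) (1 - g - lam) * dnorm(eta)
  valideta <- function(eta) TRUE
  name <- paste("probit.2asym(", g, ", ", lam, ")", sep = "")
  structure(list(linkfun = linkfun, linkinv = linkinv, mu.eta = mu.eta,
                 valideta = valideta, name = name), class = "link-glm")
}

g0 <- 0.1
lam0 <- 0.1
probit2 <- probit.2asym(g0, lam0)
fam <- binomial(link = probit2)

span <- 1 - g0 - lam0                      # width of the attainable range of mu
eta_of <- probit2$linkfun                  # eta as a function of mu
## span * dnorm(eta) is d mu / d eta; guard against division by zero
dens <- function(mu) span * pmax(dnorm(eta_of(mu)), .Machine$double.eps)

## Assumed component names (see ?fix.family.link): derivatives of the link wrt mu
fam$d2link <- function(mu) eta_of(mu) / dens(mu)^2
fam$d3link <- function(mu) (1 + 2 * eta_of(mu)^2) / dens(mu)^3
fam$d4link <- function(mu) (7 * eta_of(mu) + 6 * eta_of(mu)^3) / dens(mu)^4
```

If the assumption about fix.family.link holds, passing family = fam to gam with the default optimizer should get past the "link not recognised" check. Treat this as a starting point rather than a verified fix.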

Reproducible example

You did not provide a reproducible example, so I prepared one below.

set.seed(0)
x <- sort(runif(500, 0, 1))    ## covariates (sorted to make plotting easier)
eta <- -4 + 3 * x * exp(x) - 2 * log(x) * sqrt(x)   ## true linear predictor
p <- binomial(link = "logit")$linkinv(eta)    ## true probability (response)
y <- rbinom(500, 1, p)    ## binary observations

table(y)    ## a quick check that data are not skewed
#  0   1 
#271 229

I will take g = 0.1 and lam = 0.1 for the function probit.2asym you intend to use:

probit2 <- probit.2asym(0.1, 0.1)

par(mfrow = c(1,3))

## fit a glm with logit link
glm_logit <- glm(y ~ x, family = binomial(link = "logit"))
plot(x, eta, type = "l", main = "glm with logit link")
lines(x, glm_logit$linear.predictors, col = 2)

## glm with probit.2asym
glm_probit2 <- glm(y ~ x, family = binomial(link = probit2))
plot(x, eta, type = "l", main = "glm with probit2")
lines(x, glm_probit2$linear.predictors, col = 2)

## gam with probit.2asym
library(mgcv)
gam_probit2 <- gam(y ~ s(x, bs = 'cr', k = 3), family = binomial(link = probit2),
                   optimizer = "perf")
plot(x, eta, type = "l", main = "gam with probit2")
lines(x, gam_probit2$linear.predictors, col = 2)

I have used the natural cubic spline basis cr for s(x), because for a univariate smooth the default thin-plate spline is unnecessary. I have also set a small basis dimension, k = 3 (it can't be smaller for a cubic spline), as my toy data are nearly linear and a big basis dimension is not needed. More importantly, this seems to prevent convergence failure of performance iteration for my toy dataset.

Answer 1: 4 votes, accepted, 2016-10-01 02:32:22