Nonlinear Least Squares in R - Levenberg Marquardt to Fit Heligman Pollard Model Parameters
I am attempting to reproduce the results of a paper by Kostakis. In this paper an abridged mortality table is expanded to a complete life table using the Heligman-Pollard model. The model has 8 parameters which have to be fitted. The author used a modified Gauss-Newton algorithm; this algorithm (E04FDF) is part of the NAG library of computer programs. Shouldn't Levenberg-Marquardt yield the same set of parameters? What is wrong with my code or my application of the LM algorithm?
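For reference, the model as implemented in the code below can be written out as follows. This is a transcription of the HP8, qxPred and nqxPred functions rather than a quotation from the paper, so the notation (capital letters for the eight parameters) is an assumption. The middle term is dropped at x = 0, where ln(x/F) is undefined.

```latex
\frac{q_x}{p_x} = A^{(x+B)^{C}} + D\,\exp\!\left[-E\left(\ln\frac{x}{F}\right)^{2}\right] + G H^{x},
\qquad
q_x = \frac{q_x/p_x}{1 + q_x/p_x},
\qquad
{}_5q_x = 1 - \prod_{k=0}^{4}\left(1 - q_{x+k}\right)
```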
library(minpack.lm)
## Heligman-Pollard is used to expand an abridged table.
## nonlinear least squares algorithm is used to fit the parameters on nqx observed over 5 year intervals (5qx)
AGE <- c(0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70)
MORTALITY <- c(0.010384069, 0.001469140, 0.001309318, 0.003814265, 0.005378395, 0.005985625, 0.006741766, 0.009325056, 0.014149626, 0.021601755, 0.034271934, 0.053836246, 0.085287751, 0.136549522, 0.215953304)
## Start parameters for the Heligman-Pollard formula (converged set: a=0.0005893,b=0.0043836,c=0.0828424,d=0.000706,e=9.927863,f=22.197312,g=0.00004948,h=1.10003)
## Parameter "a" was perturbed to obtain start values; the converged set is listed above.
parStart <- list(a=0.0008893,b=0.0043836,c=0.0828424,d=0.000706,e=9.927863,f=22.197312,g=0.00004948,h=1.10003)
## The Heligman-Pollard Formula (HP8) = qx/px = ...8 parameter equation
HP8 <-function(parS,x)
ifelse(x==0, parS$a^((x+parS$b)^parS$c) + parS$g*parS$h^x,
parS$a^((x+parS$b)^parS$c) + parS$d*exp(-parS$e*(log(x/parS$f))^2) +
parS$g*parS$h^x)
## Define qx = HP8/(1+HP8)
qxPred <- function(parS,x) HP8(parS,x)/(1+HP8(parS,x))
## Calculate nqx predicted by HP8 model (nqxPred(parStart,x))
nqxPred <- function(parS,x)
(1 -(1-qxPred(parS,x)) * (1-qxPred(parS,x+1)) *
(1-qxPred(parS,x+2)) * (1-qxPred(parS,x+3)) *
(1-qxPred(parS,x+4)))
##Define Residual Function, the relative squared distance is minimized
ResidFun <- function(parS, Observed,x) (nqxPred(parS,x)/Observed-1)^2
## Applying the nls.lm algo.
nls.out <- nls.lm(par=parStart, fn = ResidFun, Observed = MORTALITY, x = AGE,
control = nls.lm.control(nprint=1,
ftol = .Machine$double.eps,
ptol = .Machine$double.eps,
maxfev=10000, maxiter = 500))
summary(nls.out)
## The author used a modified Gauss-Newton algorithm; this algorithm (E04FDF) is part of the NAG library of computer programs
## Shouldn't Levenberg-Marquardt yield the same set of parameters?
The bottom line here is that @Roland is absolutely right: this is a very ill-posed problem, and you shouldn't necessarily expect to get reliable answers. Below I've changed ResidFun to return residuals, not squared residuals. (The former is correct, but this doesn't make very much difference.)
Load package:
library(minpack.lm)
Data, as a data frame:
d <- data.frame(
AGE = seq(0,70,by=5),
MORTALITY=c(0.010384069, 0.001469140, 0.001309318, 0.003814265,
0.005378395, 0.005985625, 0.006741766, 0.009325056,
0.014149626, 0.021601755, 0.034271934, 0.053836246,
0.085287751, 0.136549522, 0.215953304))
First view of the data:
library(ggplot2)
(g1 <- ggplot(d,aes(AGE,MORTALITY))+geom_point())
g1+geom_smooth() ## with loess fit
Parameter choices:
Presumably these are the parameters from the original paper ...
parConv <- c(a=0.0005893,b=0.0043836,c=0.0828424,
d=0.000706,e=9.927863,f=22.197312,g=0.00004948,h=1.10003)
Perturbed parameters:
parStart <- parConv
parStart["a"] <- parStart["a"]+3e-4
The formulae:
HP8 <-function(parS,x)
with(as.list(parS),
ifelse(x==0, a^((x+b)^c) + g*h^x,
a^((x+b)^c) + d*exp(-e*(log(x/f))^2) + g*h^x))
## Define qx = HP8/(1+HP8)
qxPred <- function(parS,x) {
h <- HP8(parS,x)
h/(1+h)
}
## Calculate nqx predicted by HP8 model (nqxPred(parStart,x))
nqxPred <- function(parS,x)
(1 -(1-qxPred(parS,x)) * (1-qxPred(parS,x+1)) *
(1-qxPred(parS,x+2)) * (1-qxPred(parS,x+3)) *
(1-qxPred(parS,x+4)))
##Define Residual Function, the relative squared distance is minimized
ResidFun <- function(parS, Observed,x) (nqxPred(parS,x)/Observed-1)
NB: this is changed slightly from the OP's version; nls.lm wants residuals, not squared residuals.
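To see why the distinction matters: a least-squares routine that squares whatever fn returns will, if handed squared residuals, minimize a sum of fourth powers instead. A minimal base-R sketch with a made-up one-parameter problem (the data y and the helper names ss2 and ss4 are illustrative, not from the post):

```r
# Toy data with one large value; we fit a single constant m.
y <- c(1, 2, 3, 10)

# Objective when the residual function returns residuals r = y - m:
ss2 <- function(m) sum((y - m)^2)

# Objective when it returns squared residuals r^2, which then get squared again:
ss4 <- function(m) sum(((y - m)^2)^2)

m2 <- optimize(ss2, interval = range(y))$minimum
m4 <- optimize(ss4, interval = range(y))$minimum

# m2 is (numerically) the mean of y; m4 is pulled further toward the large value.
c(residuals = m2, squared_residuals = m4)
```

How much the two objectives disagree depends on the data; for the mortality fit the post reports that it makes little difference.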
A sum-of-squares function for use with other optimizers:
ssqfun <- function(parS, Observed, x) {
sum(ResidFun(parS, Observed, x)^2)
}
Applying nls.lm. (Not sure why ftol and ptol were lowered from sqrt(.Machine$double.eps) to .Machine$double.eps; the former is generally a practical limit to precision ...)
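One way to see why sqrt(.Machine$double.eps) is a sensible floor: when derivatives are approximated by forward differences, which as far as I know is what nls.lm does when no jac is supplied, the result is only good to roughly that level. The sketch below uses a toy function f; it illustrates the general point and is not taken from minpack.lm itself:

```r
eps <- .Machine$double.eps          # about 2.2e-16
f   <- function(x) exp(x)           # toy function; its exact derivative at 1 is exp(1)

# Forward-difference derivative estimate at x = 1 with step h
fd <- function(h) (f(1 + h) - f(1)) / h

err_sqrt <- abs(fd(sqrt(eps)) - exp(1))  # step of about 1.5e-8
err_eps  <- abs(fd(eps) - exp(1))        # step at machine precision

# The machine-precision step gives a far worse derivative estimate.
c(sqrt_eps_step = err_sqrt, eps_step = err_eps)
```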
nls.out <- nls.lm(par=parStart, fn = ResidFun,
Observed = d$MORTALITY, x = d$AGE,
control = nls.lm.control(nprint=0,
ftol = .Machine$double.eps,
ptol = .Machine$double.eps,
maxfev=10000, maxiter = 1000))
parNLS <- coef(nls.out)
pred0 <- nqxPred(as.list(parConv),d$AGE)
pred1 <- nqxPred(as.list(parNLS),d$AGE)
dPred <- with(d,rbind(data.frame(AGE,MORTALITY=pred0,w="conv"),
data.frame(AGE,MORTALITY=pred1,w="nls")))
g1 + geom_line(data=dPred,aes(colour=w))
The lines are indistinguishable, but the parameters have some big differences:
round(cbind(parNLS,parConv),5)
## parNLS parConv
## a 1.00000 0.00059
## b 50.46708 0.00438
## c 3.56799 0.08284
## d 0.00072 0.00071
## e 6.05200 9.92786
## f 21.82347 22.19731
## g 0.00005 0.00005
## h 1.10026 1.10003
d, f, g, h are close, but a, b, c are orders of magnitude different and e is 50% different.
Looking at the original equations, what's happening here is that a^((x+b)^c) is getting set to a constant, because a is approaching 1: once a is approximately 1, b and c are essentially irrelevant.
Let's check the correlation (we need a generalized inverse because the matrix is so strongly correlated):
obj <- nls.out
vcov <- with(obj,deviance/(length(fvec) - length(par)) *
MASS::ginv(hessian))
cmat <- round(cov2cor(vcov),1)
dimnames(cmat) <- list(letters[1:8],letters[1:8])
cmat
## a b c d e f g h
## a 1.0 0.0 0.0 0.0 0.0 0.0 -0.1 0.0
## b 0.0 1.0 -1.0 1.0 -1.0 -1.0 -0.4 -1.0
## c 0.0 -1.0 1.0 -1.0 1.0 1.0 0.4 1.0
## d 0.0 1.0 -1.0 1.0 -1.0 -1.0 -0.4 -1.0
## e 0.0 -1.0 1.0 -1.0 1.0 1.0 0.4 1.0
## f 0.0 -1.0 1.0 -1.0 1.0 1.0 0.4 1.0
## g -0.1 -0.4 0.4 -0.4 0.4 0.4 1.0 0.4
## h 0.0 -1.0 1.0 -1.0 1.0 1.0 0.4 1.0
This is not actually so useful; it really just confirms that lots of the variables are strongly correlated ...
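As a small illustration of why a generalized inverse is needed here: an exactly singular matrix cannot be inverted with solve(), while MASS::ginv() returns the Moore-Penrose pseudoinverse, which cov2cor() can then turn into correlations. The matrix H below is a made-up rank-1 stand-in for a Hessian, not the one from the fit:

```r
# Made-up 2 x 2 "Hessian" of rank 1: the second row is twice the first.
H <- matrix(c(2, 4,
              4, 8), nrow = 2, byrow = TRUE)

# Ordinary inversion fails because H is singular.
plain <- try(solve(H), silent = TRUE)
inherits(plain, "try-error")

# The pseudoinverse exists and satisfies H %*% Hinv %*% H == H.
Hinv <- MASS::ginv(H)
all.equal(H %*% Hinv %*% H, H)

# Converting to correlations shows the two directions are perfectly correlated.
cov2cor(Hinv)
```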
library(optimx)
mvec <- c('Nelder-Mead','BFGS','CG','L-BFGS-B',
'nlm','nlminb','spg','ucminf')
opt1 <- optimx(par=parStart, fn = ssqfun,
Observed = d$MORTALITY, x = d$AGE,
itnmax=5000,
method=mvec,control=list(kkt=TRUE))
## control=list(all.methods=TRUE,kkt=TRUE)) ## Boom!
## fvalues method fns grs itns conv KKT1 KKT2 xtimes
## 2 8.988466e+307 BFGS NA NULL NULL 9999 NA NA 0
## 3 8.988466e+307 CG NA NULL NULL 9999 NA NA 0
## 4 8.988466e+307 L-BFGS-B NA NULL NULL 9999 NA NA 0
## 5 8.988466e+307 nlm NA NA NA 9999 NA NA 0
## 7 0.3400858 spg 1 NA 1 3 NA NA 0.064
## 8 0.3400858 ucminf 1 1 NULL 0 NA NA 0.032
## 1 0.06099295 Nelder-Mead 501 NA NULL 1 NA NA 0.252
## 6 0.009275733 nlminb 200 1204 145 1 NA NA 0.708
This warns about bad scaling, and also finds a variety of different answers: only ucminf claims to have converged, but nlminb gets a better answer, and the itnmax parameter seems to be ignored ...
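Bad scaling here means the parameters span many orders of magnitude (g is about 5e-5 while f is about 22). One common remedy, sketched below for base R's optim() on a made-up quadratic objective, is to pass typical magnitudes via control$parscale. The objective toy and the chosen scales are assumptions for illustration; this is not a claim that rescaling rescues the Heligman-Pollard fit:

```r
# Toy objective whose two parameters live on very different scales;
# the minimum is at p = c(5e-4, 20).
toy <- function(p) (p[1] * 1e4 - 5)^2 + (p[2] - 20)^2

start <- c(1e-4, 10)

# parscale tells optim() the typical size of each parameter, so the search
# is carried out on par / parscale, where both components are of order 1.
fit <- optim(start, toy, method = "BFGS",
             control = list(parscale = c(1e-4, 10), maxit = 500))

fit$par
fit$convergence   # 0 indicates that optim() reports success
```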
opt2 <- nlminb(start=parStart, objective = ssqfun,
Observed = d$MORTALITY, x = d$AGE,
control= list(eval.max=5000,iter.max=5000))
parNLM <- opt2$par
Finishes, but with a false convergence warning ...
round(cbind(parNLS,parConv,parNLM),5)
## parNLS parConv parNLM
## a 1.00000 0.00059 1.00000
## b 50.46708 0.00438 55.37270
## c 3.56799 0.08284 3.89162
## d 0.00072 0.00071 0.00072
## e 6.05200 9.92786 6.04416
## f 21.82347 22.19731 21.82292
## g 0.00005 0.00005 0.00005
## h 1.10026 1.10003 1.10026
sapply(list(parNLS,parConv,parNLM),
ssqfun,Observed=d$MORTALITY,x=d$AGE)
## [1] 0.006346250 0.049972367 0.006315034
It looks like nlminb and minpack.lm are getting similar answers, and are actually doing better than the originally stated parameters (by quite a bit):
pred2 <- nqxPred(as.list(parNLM),d$AGE)
dPred <- with(d,rbind(dPred,
data.frame(AGE,MORTALITY=pred2,w="nlminb")))
g1 + geom_line(data=dPred,aes(colour=w))
ggsave("cmpplot.png")
ggplot(data=dPred,aes(x=AGE,y=MORTALITY-d$MORTALITY,colour=w))+
geom_line()+geom_point(aes(shape=w),alpha=0.3)
ggsave("residplot.png")
Other things one could try would be:
- the slice function from bbmle, to explore whether the old and new parameters seem to represent distinct minima, or whether the old parameters are just a false convergence ...
- the KKT (Karush-Kuhn-Tucker) criterion calculators from optimx or related packages, for similar checks
PS: the largest deviations (by far) are for the oldest age classes, which probably also have small samples. From a statistical point of view it would probably be worth doing a fit weighted by the precision of the individual points ...
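A weighted fit can be set up by multiplying each residual by the square root of its weight, so that the sum of squares the optimizer sees is the weighted one. A minimal sketch; the weights w, the helper name weighted_resid and the numbers are hypothetical (in practice w might come from the exposure or death counts behind each rate, which the post does not give):

```r
# Relative residuals as in ResidFun, scaled by sqrt(weight).
weighted_resid <- function(pred, obs, w) sqrt(w) * (pred / obs - 1)

obs  <- c(0.010, 0.002, 0.050)   # made-up observed rates
pred <- c(0.011, 0.002, 0.045)   # made-up model predictions
w    <- c(1000, 5000, 200)       # made-up precision weights

r <- weighted_resid(pred, obs, w)

# The plain sum of squares of r equals the weighted sum of squares.
all.equal(sum(r^2), sum(w * (pred / obs - 1)^2))
```

A function of this shape could be passed as fn to nls.lm, with w supplied as an extra argument in the same way as Observed and x.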
@BenBolker, I tried fitting the parameters with the entire dataset (the underlying qx values). I am still not able to reproduce the parameters.
library(minpack.lm)
library(ggplot2)
library(optimx)
getwd()
d <- data.frame(AGE = seq(0,74), MORTALITY=c(869,58,40,37,36,35,32,28,29,23,24,22,24,28,
33,52,57,77,93,103,103,109,105,114,108,112,119,
125,117,127,125,134,134,131,152,179,173,182,199,
203,232,245,296,315,335,356,405,438,445,535,594,
623,693,749,816,915,994,1128,1172,1294,1473,
1544,1721,1967,2129,2331,2559,2901,3203,3470,
3782,4348,4714,5245,5646))
d$MORTALITY <- d$MORTALITY/100000
ggplot(d,aes(AGE,MORTALITY))+geom_point()
##Not allowed to post Images
g1 <- ggplot(d,aes(AGE,MORTALITY))+geom_point()
g1+geom_smooth()## with loess fit
Reported parameters:
parConv <- c(a=0.0005893,b=0.0043836,c=0.0828424,d=0.000706,e=9.927863,f=22.197312,
g=0.00004948,h=1.10003)
parStart <- parConv
parStart["a"] <- parStart["a"]+3e-4
## Define qx = HP8/(1+HP8)
HP8 <-function(parS,x)
with(as.list(parS),
ifelse(x==0, a^((x+b)^c) + g*h^x, a^((x+b)^c) + d*exp(-e*(log(x/f))^2) + g*h^x))
qxPred <- function(parS,x) {
h <- HP8(parS,x)
h/(1+h)
}
##Define Residual Function, the relative squared distance is minimized,
ResidFun <- function(parS, Observed,x) (qxPred(parS,x)/Observed-1)
ssqfun <- function(parS, Observed, x) {
sum(ResidFun(parS, Observed, x)^2)
}
nls.out <- nls.lm(par=parStart, fn = ResidFun, Observed = d$MORTALITY, x = d$AGE,
control = nls.lm.control(nprint=1, ftol = sqrt(.Machine$double.eps),
ptol = sqrt(.Machine$double.eps), maxfev=1000, maxiter=1000))
parNLS <- coef(nls.out)
pred0 <- qxPred(as.list(parConv),d$AGE)
pred1 <- qxPred(as.list(parNLS),d$AGE)
#Binds Row wise the dataframes from pred0 and pred1
dPred <- with(d,rbind(data.frame(AGE,MORTALITY=pred0,w="conv"),
data.frame(AGE,MORTALITY=pred1,w="nls")))
g1 + geom_line(data=dPred,aes(colour=w))
round(cbind(parNLS,parConv),7)
mvec <- c('Nelder-Mead','BFGS','CG','L-BFGS-B','nlm','nlminb','spg','ucminf')
opt1 <- optimx(par=parStart, fn = ssqfun,
Observed = d$MORTALITY, x = d$AGE,
itnmax=5000,
method=mvec, control=list(all.methods=TRUE,kkt=TRUE))
## control=list(all.methods=TRUE,kkt=TRUE)) ## Boom
get.result(opt1, attribute= c("fvalues","method", "grs", "itns",
"conv", "KKT1", "KKT2", "xtimes"))
## method fvalues grs itns conv KKT1 KKT2 xtimes
##5 nlm 8.988466e+307 NA NA 9999 NA NA 0
##4 L-BFGS-B 8.988466e+307 NULL NULL 9999 NA NA 0
##2 CG 8.988466e+307 NULL NULL 9999 NA NA 0.02
##1 BFGS 8.988466e+307 NULL NULL 9999 NA NA 0
##3 Nelder-Mead 0.5673864 NA NULL 0 NA NA 0.42
##6 nlminb 0.4127198 546 62 0 NA NA 0.17
opt2 <- nlminb(start=parStart, objective = ssqfun,
Observed = d$MORTALITY, x = d$AGE,
control= list(eval.max=5000,iter.max=5000))
parNLM <- opt2$par
Check on parameters:
round(cbind(parNLS,parConv,parNLM),5)
## parNLS parConv parNLM
##a 0.00058 0.00059 0.00058
##b 0.00369 0.00438 0.00369
##c 0.08065 0.08284 0.08065
##d 0.00070 0.00071 0.00070
##e 9.30948 9.92786 9.30970
##f 22.30769 22.19731 22.30769
##g 0.00005 0.00005 0.00005
##h 1.10084 1.10003 1.10084
SSE review:
sapply(list(parNLS,parConv,parNLM),
ssqfun,Observed=d$MORTALITY,x=d$AGE)
##[1] 0.4127198 0.4169513 0.4127198
I am not able to upload graphs, but the code is here. It still appears that the parameters found in the article are not the best fit when the complete mortality data (not abridged or subset) is used.
## pred2 <- qxPred(as.list(parNLM),d$AGE)
## dPred <- with(d,rbind(dPred,
##          data.frame(AGE,MORTALITY=pred2,w="nlminb")))
## g1 + geom_line(data=dPred,aes(colour=w))
## Residuals of each fit; the `+` must end a line for R to continue the expression
ggplot(data=dPred,aes(x=AGE,y=MORTALITY-d$MORTALITY,colour=w)) +
  geom_line()+geom_point(aes(shape=w),alpha=0.3)
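To put the three sums of squares printed above in proportion, a quick calculation on those values (copied from the output, nothing refitted):

```r
# SSE values as printed by the sapply() call above
sse <- c(parNLS = 0.4127198, parConv = 0.4169513, parNLM = 0.4127198)

# Percentage by which each parameter set lowers the SSE relative to the paper's parameters
pct_lower <- 100 * (sse[["parConv"]] - sse) / sse[["parConv"]]
round(pct_lower, 2)
```

On the complete data the refitted parameters lower the relative sum of squares by about 1%.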