In R, how do I do nonlinear least squares optimization that involves solving differential equations?

Question

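Update with reproducible examples to illustrate my problems

My original question was "Implementation of the trust-region-reflective optimization algorithm in R". However, while producing a reproducible example (thanks to @Ben for the advice), I realized that my real problem is this: in Matlab, the single function lsqnonlin is good enough for most of my cases (meaning it is fast and does not need a well-chosen starting value), while in R there is no such one-for-all function. Different optimization algorithms work well in different cases, and different algorithms reach different solutions. The reason may not be that the optimization algorithms in R are inferior to the trust-region-reflective algorithm in Matlab; it could also be related to how R handles automatic differentiation. This problem actually comes from work that was interrupted two years ago. At that time, Prof. John C. Nash, one of the authors of the package optimx, suggested that a lot of work has gone into automatic differentiation in Matlab, which might be why Matlab's lsqnonlin performs better than the optimization functions/algorithms in R. I am not able to figure this out with my own knowledge.

The examples below show some of the problems I have encountered (more reproducible examples are coming). To run them, first run install_github("KineticEval","zhenglei-gao"). You need to install the package mkin and its dependencies, and you may also need to install a number of other packages for the different optimization algorithms.

Basically, I am trying to solve nonlinear least squares curve-fitting problems as described in the documentation of the Matlab function lsqnonlin ( http://www.mathworks.de/de/help/optim/ug/lsqnonlin.html ). In my case the curve is modeled by a set of differential equations. I will explain this further with the examples. The optimization algorithms I have tried include: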

Marq, the Levenberg-Marquardt algorithm, from nls.lm
Port, from nlminb
L-BFGS-B, from optim
spg, from optimx
solnp, from the package Rsolnp

I have also tried a few others, but they are not shown here.
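Written as a formula, the problem described above has roughly the following shape. The symbols $\theta$ (all parameters to be estimated), $w_{ij}$ (weights), $g$ (right-hand side of the differential equations) and $\hat{y}_i$ (model curve for compartment $i$) are notation introduced here for illustration only; they are not taken from the lsqnonlin documentation or from mkin.

$$
\begin{split}
\min_{\theta}\;& \sum_{i}\sum_{j} w_{ij}\,\bigl(y_{ij}-\hat{y}_i(t_j;\theta)\bigr)^2
\quad\text{subject to}\quad \theta_{\mathrm{lower}}\le\theta\le\theta_{\mathrm{upper}},\\
\text{where}\;& \frac{d\hat{y}}{dt}=g(\hat{y},\theta),\qquad \hat{y}(0)=\hat{y}_0(\theta).
\end{split}
$$

Every evaluation of the objective therefore requires one solution of the differential equations, which is why the number of function evaluations an optimizer needs matters so much here.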

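Summary of my questions

Is there a reliable function/algorithm in R, like lsqnonlin in Matlab, that can solve my type of nonlinear least squares problem? (I could not find one.)
Why do different optimizers reach different solutions even for a simple case?
What makes lsqnonlin superior to the functions in R? The trust-region-reflective algorithm, or something else?
Are there better ways to solve my type of problem by formulating it differently? Maybe there is a simple solution that I just do not know about.

Example 1: A simple case

I will give the R code first and explain it afterwards.
[Figure: plot of the fit for Example 1]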

ex1 <- mkinmod.full(
  Parent = list(type = "SFO", to = "Metab", sink = TRUE,
                k = list(ini = 0.1,fixed = 0,lower = 0,upper = Inf),
                M0 = list(ini = 195, fixed = 0,lower = 0,upper = Inf),
                FF = list(ini = c(.1),fixed = c(0),lower = c(0),upper = c(1)),
                time=c(0.0,2.8,   6.2,  12.0,  29.2,  66.8,  99.8, 127.5, 154.4, 229.9, 272.3, 288.1, 322.9),
                residue = c( 157.3, 206.3, 181.4, 223.0, 163.2, 144.7,85.0, 76.5, 76.4, 51.5, 45.5, 47.3, 42.7),
                weight = c( 1,  1,   1, 1, 1,   1,  1,     1,     1,     1,     1,     1,     1)),
  Metab = list(type = "SFO",
               k = list(ini = 0.1,fixed = 0,lower = 0,upper = Inf),
              M0 = list(ini = 0, fixed = 1,lower = 0,upper = Inf),
                    residue =c( 0.0,  0.0,  0.0,  1.6,  4.0, 12.3, 13.5, 12.7, 11.4, 11.6, 10.9,  9.5,  7.6),
               weight = c( 1,  1,   1, 1, 1,   1,  1,     1,     1,     1,     1,     1,     1))
  )
ex1$diffs
Fit <- NULL
alglist <- c("L-BFGS-B","Marq", "Port","spg","solnp")
for(i in 1:5) {
  Fit[[i]] <- mkinfit.full(ex1,plot = TRUE, quiet= TRUE,ctr = kingui.control(method = alglist[i],submethod = 'Port',maxIter = 100,tolerance = 1E-06, odesolver = 'lsoda'))
  }
names(Fit) <- alglist
kinplot(Fit[[2]])
(lapply(Fit, function(x) x$par))
unlist(lapply(Fit, function(x) x$ssr))

The output from the last line is:

L-BFGS-B     Marq     Port      spg    solnp 
5735.744 4714.500 5780.446 5728.361 4714.499

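Except for "Marq" and "solnp", none of the algorithms reached the optimum. In addition, the 'spg' method (and other methods such as 'bobyqa') needs too many function evaluations for such a simple case. Moreover, if I change the starting value and set k_Parent=0.0058 (the optimum value for that parameter) instead of the arbitrarily chosen 0.1, "Marq" can no longer find the optimum! (Code provided below.) I also have datasets where "solnp" does not find the optimum. With lsqnonlin in Matlab, however, I have never run into difficulties with such simple cases.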

ex1_a <- mkinmod.full(
  Parent = list(type = "SFO", to = "Metab", sink = TRUE,
                k = list(ini = 0.0058,fixed = 0,lower = 0,upper = Inf),
                M0 = list(ini = 195, fixed = 0,lower = 0,upper = Inf),
                FF = list(ini = c(.1),fixed = c(0),lower = c(0),upper = c(1)),
                time=c(0.0,2.8,   6.2,  12.0,  29.2,  66.8,  99.8, 127.5, 154.4, 229.9, 272.3, 288.1, 322.9),
                residue = c( 157.3, 206.3, 181.4, 223.0, 163.2, 144.7,85.0, 76.5, 76.4, 51.5, 45.5, 47.3, 42.7),
                weight = c( 1,  1,   1, 1, 1,   1,  1,     1,     1,     1,     1,     1,     1)),
  Metab = list(type = "SFO",
               k = list(ini = 0.1,fixed = 0,lower = 0,upper = Inf),
              M0 = list(ini = 0, fixed = 1,lower = 0,upper = Inf),
                    residue =c( 0.0,  0.0,  0.0,  1.6,  4.0, 12.3, 13.5, 12.7, 11.4, 11.6, 10.9,  9.5,  7.6),
               weight = c( 1,  1,   1, 1, 1,   1,  1,     1,     1,     1,     1,     1,     1))
  )

Fit_a <- NULL
alglist <- c("L-BFGS-B","Marq", "Port","spg","solnp")
for(i in 1:5) {
  Fit_a[[i]] <- mkinfit.full(ex1_a,plot = TRUE, quiet= TRUE,ctr = kingui.control(method = alglist[i],submethod = 'Port',maxIter = 100,tolerance = 1E-06, odesolver = 'lsoda'))
  }
names(Fit_a) <- alglist
lapply(Fit_a, function(x) x$par)
unlist(lapply(Fit_a, function(x) x$ssr))

Now the output from the last line is:

L-BFGS-B     Marq     Port      spg    solnp 
5653.132 4866.961 5653.070 5635.372 4714.499
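To check whether this sensitivity to starting values can be seen outside the KineticEval wrapper, the same sum of squares can be minimized with base R optimizers only. The following is a minimal sketch under stated assumptions, not the code behind the numbers above: it uses the closed-form solution of the model that is derived later in this question instead of an ODE solver, and the helper names `decay_diff` and `ssr` are made up for illustration. It prints the sum of squared residuals reached from both starting points and makes no claim about which optimizer wins.

```r
# Data of Example 1 (same values as in the mkinmod.full call)
t_obs  <- c(0.0, 2.8, 6.2, 12.0, 29.2, 66.8, 99.8, 127.5, 154.4, 229.9, 272.3, 288.1, 322.9)
parent <- c(157.3, 206.3, 181.4, 223.0, 163.2, 144.7, 85.0, 76.5, 76.4, 51.5, 45.5, 47.3, 42.7)
metab  <- c(0.0, 0.0, 0.0, 1.6, 4.0, 12.3, 13.5, 12.7, 11.4, 11.6, 10.9, 9.5, 7.6)

# (exp(-km*t) - exp(-kp*t)) / (kp - km), with stable branches near kp == km
# (both starting points below have kp == km or a large difference)
decay_diff <- function(kp, km, t) {
  d <- kp - km
  if (d == 0) return(t * exp(-kp * t))                         # analytic limit
  if (abs(d) < 1e-6) return(exp(-kp * t) * expm1(d * t) / d)   # avoids cancellation
  (exp(-km * t) - exp(-kp * t)) / d
}

# Sum of squared residuals over both curves; p = (M0, kp, km, formation fraction)
ssr <- function(p) {
  p <- unname(p)
  M0 <- p[1]; kp <- p[2]; km <- p[3]; ff <- p[4]
  pred_parent <- M0 * exp(-kp * t_obs)
  pred_metab  <- ff * M0 * kp * decay_diff(kp, km, t_obs)
  sum((parent - pred_parent)^2) + sum((metab - pred_metab)^2)
}

lower <- c(0, 0, 0, 0)
upper <- c(Inf, Inf, Inf, 1)

# The two starting points used in this question (k_Parent = 0.1 and 0.0058)
starts <- list(k0.1    = c(195, 0.1,    0.1, 0.1),
               k0.0058 = c(195, 0.0058, 0.1, 0.1))

results <- sapply(starts, function(s) {
  c(start      = ssr(s),
    `L-BFGS-B` = optim(s, ssr, method = "L-BFGS-B",
                       lower = lower, upper = upper)$value,
    Port       = nlminb(s, ssr, lower = lower, upper = upper)$objective)
})
print(results)
```

Because the parameters differ by several orders of magnitude (M0 around 200, rate constants around 0.01), the `parscale` control of `optim` and the `scale` argument of `nlminb` are natural things to experiment with in this sketch.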

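Let me explain what I am optimizing here. If you have run the script above and looked at the curves, you will have seen that we use a two-compartment model with first-order reactions to describe them. The differential equations that express the model are in ex1$diffs: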

                                                             Parent 
                                    "d_Parent = - k_Parent * Parent" 
                                                               Metab 
"d_Metab = - k_Metab * Metab + k_Parent * f_Parent_to_Metab * Parent"

For this simple case, the equations describing the two curves can be derived from the differential equations. The parameters to be optimized are $M_0, k_p, k_m, c=\mbox{FF\_parent\_to\_Met}$, with the constraints $M_0>0, k_p>0, k_m>0, 1>c>0$.

$$
\begin{split}
            y_{1j} &= M_0e^{-k_pt_j}+\epsilon_{1j}\\
            y_{2j} &= cM_0k_p\frac{e^{-k_mt_j}-e^{-k_pt_j}}{k_p-k_m}+\epsilon_{2j}
            \end{split}
$$
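A short derivation sketch of these two expressions follows. It writes $P(t)$ and $M(t)$ for the noise-free parent and metabolite curves (names introduced here for illustration) and assumes $M(0)=0$ and $k_p\neq k_m$:

$$
\begin{split}
\frac{dP}{dt} &= -k_pP,\quad P(0)=M_0 \;\Rightarrow\; P(t)=M_0e^{-k_pt}\\
\frac{dM}{dt}+k_mM &= c\,k_pM_0e^{-k_pt},\quad M(0)=0\\
\frac{d}{dt}\left(Me^{k_mt}\right) &= c\,k_pM_0e^{(k_m-k_p)t}\\
M(t) &= cM_0k_p\frac{e^{-k_mt}-e^{-k_pt}}{k_p-k_m}
\end{split}
$$

In the limit $k_p\to k_m$ the last expression tends to $cM_0k_p\,t\,e^{-k_pt}$, so the formula needs special handling when an optimizer visits points where the two rate constants coincide.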

Therefore, we can fit the curves without solving the differential equations.

BCS1.l <- mkin_wide_to_long(BCS1)
BCS1.l <- na.omit(BCS1.l)
indi <- c(rep(1,sum(BCS1.l$name=='Parent')),rep(0,sum(BCS1.l$name=='Metab')))
sysequ.indi <- function(t,indi,M0,kp,km,C)
  {
    y <- indi*M0*exp(-kp*t)+(1-indi)*C*M0*kp/(kp-km)*(exp(-km*t)-exp(-kp*t));
    y
  }
M00 <- 100
kp0 <- 0.1
km0 <- 0.01
C0 <- 0.1
library(nlme)
result1 <- gnls(value ~ sysequ.indi(time,indi,M0,kp,km,C),data=BCS1.l,start=list(M0=M00,kp=kp0,km=km0,C=C0),control=gnlsControl())
#result3 <- gnls(value ~ sysequ.indi(time,indi,M0,kp,km,C),data=BCS1.l,start=list(M0=M00,kp=kp0,km=km0,C=C0),weights = varIdent(form=~1|name))
## Coefficients:
##          M0           kp           km            C 
## 1.946170e+02 5.800074e-03 8.404269e-03 2.208788e-01

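This way the elapsed time is almost 0 and the optimum is reached. However, we do not always have such a simple case. The model can be complex, and then the differential equations have to be solved. See Example 2.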
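When no closed form is available, every evaluation of the objective has to integrate the differential equations first. The sketch below shows that coupling for the two-compartment model of Example 1, using base R only. It rests on several assumptions: a hand-written fixed-step Runge-Kutta integrator (`rk4_solve`, a hypothetical helper) stands in for the lsoda solver used in the fits above, the "observations" are synthetic and noise-free, and `truth` is an arbitrary parameter set chosen for illustration.

```r
# Right-hand side of the model printed by ex1$diffs
rhs <- function(y, p) {
  c(Parent = -p[["k_Parent"]] * y[["Parent"]],
    Metab  = -p[["k_Metab"]] * y[["Metab"]] +
      p[["k_Parent"]] * p[["f_Parent_to_Metab"]] * y[["Parent"]])
}

# Classical fixed-step RK4; a simple stand-in for lsoda, not what mkin uses
rk4_solve <- function(y0, times, p, h = 0.5) {
  out <- matrix(NA_real_, nrow = length(times), ncol = length(y0),
                dimnames = list(NULL, names(y0)))
  y <- y0
  t <- times[1]
  out[1, ] <- y
  for (i in seq_along(times)[-1]) {
    n  <- ceiling((times[i] - t) / h)   # number of substeps up to the next output time
    dt <- (times[i] - t) / n
    for (s in seq_len(n)) {
      k1 <- rhs(y, p)
      k2 <- rhs(y + dt / 2 * k1, p)
      k3 <- rhs(y + dt / 2 * k2, p)
      k4 <- rhs(y + dt * k3, p)
      y  <- y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    }
    t <- times[i]
    out[i, ] <- y
  }
  cbind(time = times, out)
}

# Objective: solve the ODEs for the candidate parameters, then sum squared residuals
ssr_ode <- function(p, times, obs) {
  names(p) <- c("M0", "k_Parent", "k_Metab", "f_Parent_to_Metab")
  sim <- rk4_solve(c(Parent = p[["M0"]], Metab = 0), times, p)
  sum((obs - sim[, c("Parent", "Metab")])^2)
}

# Synthetic, noise-free observations generated from an assumed parameter set
times <- c(0, 2.8, 6.2, 12, 29.2, 66.8)
truth <- c(M0 = 195, k_Parent = 0.0058, k_Metab = 0.0084, f_Parent_to_Metab = 0.22)
obs   <- rk4_solve(c(Parent = 195, Metab = 0), times, truth)[, c("Parent", "Metab")]

start <- c(150, 0.01, 0.01, 0.5)
fit <- nlminb(start, function(p) ssr_ode(p, times, obs),
              lower = c(0, 0, 0, 0), upper = c(Inf, Inf, Inf, 1))
print(fit$par)
print(fit$objective)
```

With an adaptive solver, the solver tolerances also limit how accurately finite-difference gradients of such an objective can be computed, which is one possible reason why gradient-based optimizers behave differently on ODE models than on closed-form ones.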

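Example 2: A complex model

I worked on this dataset a long time ago and have not had time to finish running the following script myself. (It may take hours to run.)
[Figure: plot of the fit for Example 2]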

data(BCS2)
ex2 <- mkinmod.full(Parent= list(type = "SFO",to = c( "Met1", "Met2","Met4", "Met5"),
                                 k = list(ini = 0.1,fixed = 0,lower = 0,upper = Inf),
                                 M0 = list(ini = 100,fixed = 0,lower = 0,upper = Inf),
                                 FF = list(ini = c(.1,.1,.1,.1),fixed = c(0,0,0,0),lower = c(0,0,0,0),upper = c(1,1,1,1))),
                    Met1 = list(type = "SFO",to = c("Met3", "Met4")),
                    Met2 = list(type = "SFO",to = c("Met3")),
                    Met3 = list(type = "SFO" ),
                    Met4 = list(type = "SFO", to = c("Met5")),
                    Met5 = list(type = "SFO"),
                    data=BCS2)
ex2$diffs
Fit2 <- NULL
alglist <- c("L-BFGS-B","Marq", "Port","spg","solnp")
for(i in 1:5) {
  Fit2[[i]] <- mkinfit.full(ex2,plot = TRUE, quiet= TRUE,ctr = kingui.control(method = alglist[i],submethod = 'Port',maxIter = 100,tolerance = 1E-06, odesolver = 'lsoda'))
  }
names(Fit2) <- alglist
(lapply(Fit2, function(x) x$par))
unlist(lapply(Fit2, function(x) x$ssr))

This is an example where you will see warning messages like:

DLSODA-  At T (=R1) and step size H (=R2), the    
  corrector convergence failed repeatedly     
  or with ABS(H) = HMIN   
In above message, R = 
[1] 0.000000e+00 2.289412e-09

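Original question

Many of the methods used in the Matlab Optimization Toolbox solvers are based on trust regions. According to the CRAN Task View page, only the packages trust, trustOptim and minqa have trust-region-based methods. However, trust and trustOptim require the gradient and the Hessian, and bobyqa in minqa does not seem to be what I am looking for. In my personal experience, the trust-region-reflective algorithm in Matlab often performs better than the algorithms I have tried in R, so I tried to find a similar implementation of this algorithm in R.

I asked a related question here: R function to search for a function

The answer provided by Matthew Plourde gives a function lsqnonlin with the same name as the one in Matlab, but it does not implement the trust-region-reflective algorithm. I edited the old question and asked a new one here because I think Matthew Plourde's answer is generally very helpful for R users who are looking for a function.

I searched again with no luck. Are there functions/packages that implement something similar to the Matlab functions? If not, I wonder whether I am permitted to translate the Matlab function directly into R and use it for my own purposes.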

Answer 1

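In general, looking only at the title of your question, I would recommend simply using the FME package. But that is not really the point of your question, and success may depend on how you set up your model.

For the type of problem you show in your examples (fitting degradation data with several transformation products), I created the mkin package as a convenience wrapper around FME. So let's see how mkin 0.9-29 performs in these cases. With mkin, you can only use the algorithms provided by FME:

Example 1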

library(mkin)

ex1_data_wide = data.frame(
  time= c(0.0, 2.8, 6.2, 12.0, 29.2, 66.8, 99.8, 127.5, 154.4, 229.9, 272.3, 288.1, 322.9),
  Parent = c(157.3, 206.3, 181.4, 223.0, 163.2, 144.7,85.0, 76.5, 76.4, 51.5, 45.5, 47.3, 42.7),
  Metab = c(0.0, 0.0, 0.0, 1.6, 4.0, 12.3, 13.5, 12.7, 11.4, 11.6, 10.9, 9.5, 7.6))

ex1_data = mkin_wide_to_long(ex1_data_wide, time = "time")

ex1_model = mkinmod(Parent = list(type = "SFO", to = "Metab"),
                    Metab = list(type = "SFO"))

algs = c("L-BFGS-B", "Marq", "Port")

times_ex1 <- list()
fits_ex1 <- list()
for (alg in algs) {
  times_ex1[[alg]] <- system.time(fits_ex1[[alg]] <- mkinfit(ex1_model, ex1_data,
                                                             method.modFit = alg))
}

times_ex1
unlist(lapply(fits_ex1, function(x) x$ssr))

So Levenberg-Marquardt via nls.lm and the Port algorithm both find your minimum, and LM is much faster:

$`L-BFGS-B`
       User      System verstrichen 
      2.036       0.000       2.051 

$Marq
       User      System verstrichen 
      0.716       0.000       0.714 

$Port
       User      System verstrichen 
      2.032       0.000       2.030 

L-BFGS-B     Marq     Port 
5742.312 4714.498 4714.498

When I tell mkin to use formation fractions instead of only rates

ex1_model = mkinmod(Parent = list(type = "SFO", to = "Metab"),
                    Metab = list(type = "SFO"), use_of_ff = "max")

and use your starting values,

for (alg in algs) {
  times_ex1[[alg]] <- system.time(fits_ex1[[alg]] <- mkinfit(ex1_model, ex1_data,
    state.ini = c(195, 0),
    parms.ini = c(f_Parent_to_Metab = 0.1, k_Parent = 0.0058, k_Metab = 0.1),
    method.modFit = alg))
}

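all three algorithms find the same solution, and even faster. However, if I switch off the transformation of rates and fractions in the mkinfit call ( transform_rates = FALSE, transform_fractions = FALSE ), I get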

L-BFGS-B     Marq     Port 
5653.132 4714.498 5653.070

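So it seems to have to do with the way the parameters are transformed internally (FME also does this when you give boundaries). In mkin, I do explicit internal parameter transformations, so with the default settings the optimizer does not need boundaries.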
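The idea of replacing box constraints by internal transformations can be sketched in base R as follows. This is an illustration under assumptions, not mkin's implementation: it takes the logarithm of the rate constants and the logit of the formation fraction (the transformations mkin actually applies may differ), leaves M0 untransformed, and uses the closed-form solution of the Example 1 model. The helper names `to_internal`, `from_internal` and `ssr_internal` are made up for this sketch.

```r
# Data of Example 1
t_obs  <- c(0.0, 2.8, 6.2, 12.0, 29.2, 66.8, 99.8, 127.5, 154.4, 229.9, 272.3, 288.1, 322.9)
parent <- c(157.3, 206.3, 181.4, 223.0, 163.2, 144.7, 85.0, 76.5, 76.4, 51.5, 45.5, 47.3, 42.7)
metab  <- c(0.0, 0.0, 0.0, 1.6, 4.0, 12.3, 13.5, 12.7, 11.4, 11.6, 10.9, 9.5, 7.6)

# Sum of squared residuals in the natural parameters (closed-form model)
ssr <- function(M0, kp, km, ff) {
  d <- kp - km
  # analytic limit when both rate constants coincide exactly
  g <- if (d == 0) t_obs * exp(-kp * t_obs) else (exp(-km * t_obs) - exp(-kp * t_obs)) / d
  sum((parent - M0 * exp(-kp * t_obs))^2) + sum((metab - ff * M0 * kp * g)^2)
}

# Natural -> internal: log for rates (> 0), logit for the fraction in (0, 1)
to_internal   <- function(p) c(p[1], log(p[2:3]), qlogis(p[4]))
# Internal -> natural: any real vector maps back into the feasible region
from_internal <- function(q) c(q[1], exp(q[2:3]), plogis(q[4]))

# Objective seen by the optimizer: unconstrained in the internal parameters
ssr_internal <- function(q) {
  p <- unname(from_internal(q))
  ssr(p[1], p[2], p[3], p[4])
}

# Starting values from the question (k_Parent = 0.0058)
start <- c(M0 = 195, k_Parent = 0.0058, k_Metab = 0.1, f_Parent_to_Metab = 0.1)

fit <- nlminb(to_internal(start), ssr_internal)   # no lower/upper needed
estimate <- from_internal(fit$par)
print(estimate)
print(fit$objective)
```

Besides removing the bounds, the transformation also changes the scaling of the problem: the internal rate parameters are of order 1 to 10 in absolute value instead of 0.001 to 0.1, which can matter for optimizers that are sensitive to parameter scale.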

Example 2

library(mkin)
library(KineticEval) # for the dataset BCS2
data(BCS2)

ex2_data = mkin_wide_to_long(BCS2, time = "time")

ex2_model = mkinmod(Parent = list(type = "SFO", to = paste0("Met", 1:5)),
                    Met1 = list(type = "SFO", to = c("Met3", "Met4")),
                    Met2 = list(type = "SFO", to = "Met3"),
                    Met3 = list(type = "SFO"),
                    Met4 = list(type = "SFO", to = "Met5"),
                    Met5 = list(type = "SFO"))

times_ex2 <- list()
fits_ex2 <- list()

for (alg in algs) {
  times_ex2[[alg]] <- system.time(fits_ex2[[alg]] <- mkinfit(ex2_model, ex2_data,
    method.modFit = alg))
}   

times_ex2
unlist(lapply(fits_ex2, function(x) x$ssr))

Again, LM is the fastest, but the lowest minimum is found by Port:

$`L-BFGS-B`
       User      System verstrichen 
     75.728       0.004      75.653 

$Marq
       User      System verstrichen 
      6.440       0.004       6.436 

$Port
       User      System verstrichen 
     51.200       0.028      51.180 

L-BFGS-B     Marq     Port 
485.3099 572.9635 478.4379

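I used to always recommend LM, but recently I have also found that it sometimes gets stuck in a local minimum, depending on the starting values for ill-defined parameters. One example is the Schaefer 07 data, which is handled in the last unit test for mkinfit in the mkin package, called test.mkinfit.schaefer07_complex_example.

Hope this is useful, kind regards,

Johannes

P.S.: I found this question when I noticed that you had added a pure R implementation of trust-region-reflective optimization as the function lsqnonlin() to your KineticEval package on github, and I was searching for trust region reflective.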

Answer 1, posted 2014-07-04 13:44:47

例1

例2

在R中，如何進行非線性最小二乘優化，包括求解微分方程？

問題描述

使用可重復的示例進行更新以說明我的問題

我的問題摘要

例1：一個簡單的案例

例2，一個復雜的模型

原來的問題

1 個解決方案

解決方案1 1 2014-07-04 13:44:47

例1

例2

解決方案1
1 2014-07-04 13:44:47