MLE application with gekko in Python

Question

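I want to implement MLE (maximum likelihood estimation) in Python with the gekko package. Suppose we have a DataFrame with two columns, ['Loss', 'Target'], and a length of 500.
First we import the packages we need: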

from gekko import GEKKO
import numpy as np
import pandas as pd

Then we simply create the DataFrame like this:

My_DataFrame = pd.DataFrame({"Loss":np.linspace(-555.795 , 477.841 , 500) , "Target":0.0})
My_DataFrame = My_DataFrame.sort_values(by=["Loss"] , ascending=False).reset_index(drop=True)
My_DataFrame

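It looks like this:

Some entries of the ['Target'] column should be computed with the formula I wrote below the picture (the rest stay zero; I explain more further on, so please keep reading), so that you can see it clearly. The two main elements of the formula are 'Kasi' and 'Betaa'. I want to find the values of them that maximize the sum of My_DataFrame['Target']. So now you know what is going on!

Now let me show you how I wrote the code for this. First I define my objective function: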

def obj_function(Array):
    """
    [Purpose]:
        + it will calculate each component of My_DataFrame["Target"] column! then i can maximize sum(My_DataFrame["Target"]) and find best 'Kasi' and 'Betaa' for it!
    
    [Parameters]:
        + This function gets Array that contains 'Kasi' and 'Betaa'.
        Array[0] represents 'Kasi' and Array[1] represents 'Betaa'

    [returns]:
        + returns a pandas.series.
        actually it returns new components of My_DataFrame["Target"]
    """
    # in following code if you don't know what is `qw`, just look at the next code cell right after this cell (I mean next section).
    # in following code np.where(My_DataFrame["Loss"] == item)[0][0] is telling me the row's index of item. 
    for item in My_DataFrame[My_DataFrame["Loss"]>160]['Loss']:
        My_DataFrame.iloc[np.where(My_DataFrame["Loss"] == item)[0][0] , 1] = qw.log10((1/Array[1])*(  1 + (Array[0]*(item-160)/Array[1])**( (-1/Array[0]) - 1 )))

    return My_DataFrame["Target"]

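If you are confused by what happens in the for loop of obj_function, look at the picture below, which contains a short example; if not, skip this part:

Then we just need to go through the optimization. I use the gekko package for this. Note that I want to find the best values of 'Kasi' and 'Betaa', so I have two main variables and no constraints of any kind. So let's get started: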

# i have 2 variables : 'Kasi' and 'Betaa', so I put nd=2
nd = 2
qw = GEKKO()

# now i want to specify my variables ('Kasi'  and 'Betaa') with initial values --> Kasi = 0.7 and Betaa = 20.0
x = qw.Array(qw.Var , nd , value = [0.7 , 20])
# So i guess now x[0] represents 'Kasi' and x[1] represents 'Betaa'

qw.Maximize(np.sum(obj_function(x)))

Then, when I try to solve the optimization with qw.solve():

qw.solve()

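I get this error:

Exception: This steady-state IMODE only allows scalar values.

How can I fix this? (For convenience, the complete script is collected in the next section.)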

from gekko import GEKKO
import numpy as np
import pandas as pd


My_DataFrame = pd.DataFrame({"Loss":np.linspace(-555.795 , 477.841 , 500) , "Target":0.0})
My_DataFrame = My_DataFrame.sort_values(by=["Loss"] , ascending=False).reset_index(drop=True)

def obj_function(Array):
    """
    [Purpose]:
        + it will calculate each component of My_DataFrame["Target"] column! then i can maximize sum(My_DataFrame["Target"]) and find best 'Kasi' and 'Betaa' for it!
    
    [Parameters]:
        + This function gets Array that contains 'Kasi' and 'Betaa'.
        Array[0] represents 'Kasi' and Array[1] represents 'Betaa'

    [returns]:
        + returns a pandas.series.
        actually it returns new components of My_DataFrame["Target"]
    """
    # in following code if you don't know what is `qw`, just look at the next code cell right after this cell (I mean next section).
    # in following code np.where(My_DataFrame["Loss"] == item)[0][0] is telling me the row's index of item. 
    for item in My_DataFrame[My_DataFrame["Loss"]>160]['Loss']:
        My_DataFrame.iloc[np.where(My_DataFrame["Loss"] == item)[0][0] , 1] = qw.log10((1/Array[1])*(  1 + (Array[0]*(item-160)/Array[1])**( (-1/Array[0]) - 1 )))

    return My_DataFrame["Target"]



# i have 2 variables : 'Kasi' and 'Betaa', so I put nd=2
nd = 2
qw = GEKKO()

# now i want to specify my variables ('Kasi'  and 'Betaa') with initial values --> Kasi = 0.7 and Betaa = 20.0
x = qw.Array(qw.Var , nd)
for i,xi in enumerate([0.7, 20]):
   x[i].value = xi
# So i guess now x[0] represents 'Kasi' and x[1] represents 'Betaa'

qw.Maximize(qw.sum(obj_function(x)))

The suggested potential script is here:

from gekko import GEKKO
import numpy as np
import pandas as pd


My_DataFrame = pd.read_excel("[<FILE_PATH_IN_YOUR_MACHINE>]\\Losses.xlsx")
# I'll put the link to the "Losses.xlsx" file at the end of my explanation
# so you can download it from my Google Drive.


loss = My_DataFrame["Loss"]
def obj_function(x):
    k,b = x
    target = []

    for iloss in loss:
        if iloss>160:
            t = qw.log((1/b)*(1+(k*(iloss-160)/b)**((-1/k)-1)))
            target.append(t)
    return target


qw = GEKKO(remote=False)
nd = 2
x = qw.Array(qw.Var,nd)

# initial values --> Kasi = 0.7 and Betaa = 20.0
for i,xi in enumerate([0.7, 20]):
   x[i].value = xi
   
# bounds
k,b = x
k.lower=0.1; k.upper=0.8
b.lower=10;  b.upper=500
qw.Maximize(qw.sum(obj_function(x)))
qw.options.SOLVER = 1
qw.solve()
print('k = ',k.value[0])
print('b = ',b.value[0])

Python output:

Objective function = -1155.4861315885942
b = 500.0
k = 0.1

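Note that in the Python output, b stands for 'Betaa' and k stands for 'Kasi'.
The output looked a bit strange, so I decided to test it with the Microsoft Excel Solver.
(I put the link to the Excel file at the end of my explanation, so you can check it yourself if you want.) As the picture below shows, the Excel optimization completed and found an optimal solution (see the optimization result below the picture).

Excel output:

Objective function = -108.21
Betaa = 32.53161
Kasi = 0.436246

As you can see, there is a huge difference between the Python output and the Excel output, and Excel seems to do quite well. So I think the problem remains, and the suggested Python script does not perform well...
The Implementation_in_Excel.xls file with the Microsoft Excel optimization is available here. (You can also see the optimization options under the Data tab --> Analysis --> Solver.)
The data used for the optimization is the same in Excel and Python and can be found here (it is very simple: 501 rows and 1 column).
*If you can't download the files, let me know and I will update them.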

Answer 1

qw.Maximize() only sets the optimization objective; you still need to call solve() on the model.

Answer 2

The initialization applies the value [0.7, 20] to every parameter. A scalar should be used to initialize value instead, for example:

x = qw.Array(qw.Var , nd)
for i,xi in enumerate([0.7, 20]):
   x[i].value = xi

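Another issue is that gekko needs its own special functions so that it can perform automatic differentiation for the solvers. For the objective function, switch to the gekko version of the summation: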

qw.Maximize(qw.sum(obj_function(x)))

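If loss is computed by changing the values of x, then the objective function contains logical expressions that need special treatment to be solved with gradient-based solvers. Try the if3() function for the conditional statement, or slack variables (preferred). The objective function is evaluated once to build a symbolic expression, which is then compiled to byte-code and solved with one of the solvers. The symbolic expression is found in the gk0_model.apm file in the m.path folder (qw.path for the model in this script).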
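
In the scripts in this thread, loss is fixed data rather than a function of x, so the condition can be evaluated once in plain NumPy before the symbolic objective is built. A minimal sketch, assuming the synthetic linspace data from the question; the names threshold, mask and exceedances are illustrative and do not appear in the original post:

```python
import numpy as np

# Synthetic losses from the question (assumption: the real data is in Losses.xlsx)
loss = np.linspace(-555.795, 477.841, 500)
threshold = 160.0  # same cutoff as the `if iloss > 160` test in the scripts

# The mask depends only on the data, not on the decision variables,
# so an ordinary comparison is enough here
mask = loss > threshold
exceedances = loss[mask] - threshold  # amounts above the threshold

print(len(exceedances), "of", len(loss), "losses exceed the threshold")
```

Only when the branch depends on a gekko variable would the if3() or slack-variable treatment described above be relevant.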

Response to Edit

Thanks for posting an edit with the complete code. Here is a potential solution:

from gekko import GEKKO
import numpy as np
import pandas as pd

loss = np.linspace(-555.795 , 477.841 , 500)
def obj_function(x):
    k,b = x
    target = []

    for iloss in loss:
        if iloss>160:
            t = qw.log((1/b)*(1+(k*(iloss-160)/b)**((-1/k)-1)))
            target.append(t)
    return target
qw = GEKKO(remote=False)
nd = 2
x = qw.Array(qw.Var,nd)
# initial values --> Kasi = 0.7 and Betaa = 20.0
for i,xi in enumerate([0.7, 20]):
   x[i].value = xi
# bounds
k,b = x
k.lower=0.6; k.upper=0.8
b.lower=10;  b.upper=30
qw.Maximize(qw.sum(obj_function(x)))
qw.options.SOLVER = 1
qw.solve()
print('k = ',k.value[0])
print('b = ',b.value[0])

The solver reaches the bounds at the solution. The bounds may need to be widened so that the arbitrary limits are not the solution.
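
A quick way to spot this situation is to compare each reported value with its bounds. The helper below is a hypothetical sketch (at_bound is not part of gekko), applied to the values and bounds from the question's Python output:

```python
import math

def at_bound(value, lower, upper, rel_tol=1e-6):
    """Return 'lower' or 'upper' if value sits on that bound, else None."""
    if math.isclose(value, lower, rel_tol=rel_tol):
        return "lower"
    if math.isclose(value, upper, rel_tol=rel_tol):
        return "upper"
    return None

# k = 0.1 with bounds [0.1, 0.8] and b = 500.0 with bounds [10, 500],
# as reported in the question
k_status = at_bound(0.1, lower=0.1, upper=0.8)
b_status = at_bound(500.0, lower=10, upper=500)
print("k:", k_status, "| b:", b_status)
```

If either status is not None, the reported optimum is probably dictated by the bound rather than by the objective.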

Update

Here is the final solution. The objective function in the code had a problem and needed to be fixed. This is the corrected script:

from gekko import GEKKO
import numpy as np
import pandas as pd

My_DataFrame = pd.read_excel("<FILE_PATH_IN_YOUR_MACHINE>\\Losses.xlsx")
loss = My_DataFrame["Loss"]

def obj_function(x):
    k,b = x
    q = ((-1/k)-1)
    target = []

    for iloss in loss:
        if iloss>160:
            t = qw.log(1/b) + q* ( qw.log(b+k*(iloss-160)) - qw.log(b))
            target.append(t)
    return target

qw = GEKKO(remote=False)
nd = 2
x = qw.Array(qw.Var,nd)

# initial values --> Kasi = 0.7 and Betaa = 20.0
for i,xi in enumerate([0.7, 20]):
   x[i].value = xi

qw.Maximize(qw.sum(obj_function(x)))
qw.solve()
print('Kasi = ',x[0].value)
print('Betaa = ',x[1].value)

Output:

 The final value of the objective function is  108.20609317143486
 
 ---------------------------------------------------
 Solver         :  IPOPT (v3.12)
 Solution time  :  0.031200000000000006 sec
 Objective      :  108.20609317143486
 Successful solution
 ---------------------------------------------------
 

Kasi =  [0.436245842]
Betaa =  [32.531632983]

The results are close to the optimization result from Microsoft Excel.
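
To see what changed in the objective, the expressions from the two scripts can be compared numerically in plain Python. This sketch assumes the final expression is meant as the log of the density (1/b)*(1 + k*y/b)**(-1/k - 1) with y = loss - 160; the post does not name the distribution, so that reading is an assumption, and the function names are illustrative:

```python
import math

def log_density_direct(y, k, b):
    """log of (1/b) * (1 + k*y/b)**(-1/k - 1), evaluated directly."""
    return math.log((1 / b) * (1 + k * y / b) ** (-1 / k - 1))

def log_density_rewritten(y, k, b):
    """Form used in the final script: logs are split into a sum."""
    q = -1 / k - 1
    return math.log(1 / b) + q * (math.log(b + k * y) - math.log(b))

def log_earlier_expression(y, k, b):
    """Expression from the earlier scripts: the power applies to k*y/b only."""
    return math.log((1 / b) * (1 + (k * y / b) ** (-1 / k - 1)))

k, b = 0.436245842, 32.531632983  # values reported in the output above
y = 477.841 - 160                 # largest synthetic loss minus the threshold

direct = log_density_direct(y, k, b)
rewritten = log_density_rewritten(y, k, b)
earlier = log_earlier_expression(y, k, b)
print(direct, rewritten, earlier)
```

The first two values agree, while the earlier expression gives a different number because the exponent sits inside the parentheses rather than on the whole term 1 + k*y/b.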

Answer 3

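If I'm reading this correctly, My_DataFrame is already defined in the global scope.
The problem is that obj_function tries to access it (which succeeds) and then modify its values (which fails), because by default you cannot modify a global variable from a local scope.

Fix:

At the start of obj_function, add one line: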

def obj_function(Array):
    # comments
    global My_DataFrame
    for item .... # remains same

This should fix your problem.

Additional note:

If you only want to access My_DataFrame, it works without any error and you do not need to add the global keyword.
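
As a small illustration of the scoping rule, the sketch below uses a plain dict as a stand-in for My_DataFrame (an assumption made to keep it dependency-free). It separates two cases: assigning to an item of an existing object, and rebinding the name itself.

```python
settings = {"Target": 0.0}  # stand-in for the module-level My_DataFrame

def mutate_in_place():
    # Item assignment changes the existing object, so the outer name
    # still points at the (now modified) dict; no `global` is involved
    settings["Target"] = 1.5

def rebind_name():
    # Plain assignment creates a new local name; the outer dict is untouched
    settings = {"Target": 99.0}
    return settings

mutate_in_place()
local_copy = rebind_name()
print(settings, local_copy)
```

The global keyword matters for the second case, where the name is reassigned. Item assignment such as My_DataFrame.iloc[...] = ... falls into the first case, so it is worth checking whether global alone changes the behaviour in obj_function.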

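Also, I just wanted to thank you for the effort you put into this. There is a proper explanation of what you want to do, relevant background information, excellent diagrams (the Whiteboard is great too), and even a minimal working example. This is how all SO questions should be; it would make everyone's life easier.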
