How to use a sorting process in an optimization with Gekko (Python)

Question

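I am trying to solve an optimization problem in finance: find the weight of each equity in a portfolio so as to maximize the portfolio's expected return and minimize its Value at Risk (VaR). For now I am modeling this with a single objective function rather than a multi-objective optimization model. This can be done by defining a function to maximize that has two components: the portfolio's expected return, with a positive coefficient (+1), and the portfolio's VaR, with a negative coefficient (-1). So the model looks like this:
Maximize: expected_return - VaR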

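About a year and a half ago I wrote Python code with the Gekko library for the first component (maximize: expected_return), and now I am trying to include the second component as well. The problem I ran into is this: to compute the portfolio's VaR I have to sort a pd.Series and pick the value at a specific index of the sorted series, but because every value in this series contains Gekko variables, it cannot be sorted. Let me go into detail by sharing the code and the problem.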

First, let's look at the objective functions:

import pandas as pd
import numpy as np

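def expected_return(x,Array):
    # Weighted sum of the log10 predicted returns.
    # Note: `qw` is the global GEKKO model that is created further below.
    res = 0
    for j in range(len(Array)):
        res += x[j] * qw.log10(Array[j])
    return res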



def VaRfunc(df:pd.DataFrame, weights:np.ndarray, initial_investment:int, alpha:float):
    """
    Calculating the Value at Risk of the portfolio with the given weights

    Params
    
    ------

        df: pd.DataFrame
            dataframe of adjusted close prices

        weights: np.ndarray
            given weights of each equity in the portfolio

        initial_investment: int
            initial investment

        alpha: float
            the confidence level for calculating the VaR of the portfolio

    Returns

    ------
        VaR: float
            value at risk of the portfolio
    
    """
    def ScenarioMaker(x):
        """
        Helper function for mapSenario

        Help

        ----

            x[0]: close price i-1 (previous day, from series.shift())

            x[1]: close price i 

            x[2]: last close price
        """
        if isinstance(x[0], float):
            return (x[2] * x[1])/x[0]
        else:
            pass

    def mapSenario(series:pd.Series):
        """
        This function Should be used inside the df.apply as
        function.

        params

        ------

            series: pd.Series
                each column of DataFrame

        Returns

        -------

            Calculated Scenarios for each column of the given DataFrame
            through df.apply
        """
        last_close = series.iloc[-1].repeat(len(series))
        return list(map(ScenarioMaker, zip(series.shift(), series , last_close)))

    def mapPortfolioValue(series, df, weights, initial_investment):
        """
        This function should be passed as a function of df.apply which df is the
        calculated scenarios dataframe through the `mapSenario` function.

        Params

        ------

            series: pd.Series
                each passed series through df.apply

            df: pd.DataFrame
                dataframe of adjusted Close Prices

            weights: np.array
                weights of each equity in the portfolio

            initial_investment: int
                initial investment

        
        Returns
        
        -------

            pd.Series of the portfolio value **for each given series!** 
        """
        idx = df.columns.tolist().index(series.name)
        last_close = df.iloc[-1,idx]
        return list(map(lambda x: (initial_investment*weights[idx]*x)/last_close, series))


    # Creating scenarios based on adj close price of each day
    scenarios = df.apply(mapSenario)

    # Calculating portfolio value based on created scenarios for each day
    portValue = (scenarios.apply(mapPortfolioValue, args=[df, weights, initial_investment])).sum(axis=1).iloc[1:].reset_index(drop=True)
    
    # Calculate loss of portfolio based on the weight of each stock in the portfolio
    # and subtracting the portfolio's value of each day from the initial investment
    # which is 10_000 and Sorting them in descending order.
    sortedLoss = (initial_investment-portValue).sort_values(ascending=False).reset_index(drop=True)

    if int((1-alpha) * len(sortedLoss)) == ((1-alpha) * len(sortedLoss)):
        VaR = sortedLoss[int((1-alpha) * len(sortedLoss)) - 1]
    else:
        VaR = (sortedLoss[int((1-alpha) * len(sortedLoss)) - 1] +  sortedLoss[int((1-alpha) * len(sortedLoss))])/2

    return VaR
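
For reference, the last step of VaRfunc (sort the losses in descending order, then read off the value at the (1 - alpha) position) can be sketched on plain floats, where sorting works without trouble. This is a minimal illustration, not the code from the question: the name historical_var, the rounding of the index, and the guard against a negative index are assumptions added here so that floating-point noise in (1 - alpha) * n does not pick the wrong branch.

```python
def historical_var(losses, alpha):
    """VaR at confidence level alpha from a list of plain-float losses.

    Hypothetical helper that mirrors the index logic at the end of VaRfunc.
    """
    # Largest loss first, like sort_values(ascending=False).
    sorted_loss = sorted(losses, reverse=True)
    # Position of the (1 - alpha) tail. Rounding is an added assumption:
    # (1 - 0.9) * 10 evaluates to 0.9999999999999998 rather than 1.0.
    k = round((1 - alpha) * len(sorted_loss), 9)
    upper = int(k)
    # Added guard: never step back to index -1 when the tail is shorter than one sample.
    lower = max(upper - 1, 0)
    if k == upper:
        # Whole-number position: take that single loss.
        return sorted_loss[lower]
    # Fractional position: average the two neighbouring losses.
    return (sorted_loss[lower] + sorted_loss[upper]) / 2


losses = [3.0, -1.0, 7.0, 2.0, 5.0, 0.0, 4.0, 1.0]
var_75 = historical_var(losses, 0.75)
var_70 = historical_var(losses, 0.70)
print(var_75, var_70)  # 5.0 4.5
```

This only works because every entry of losses is a number. Once the entries are Gekko expressions, the comparison-based sort has nothing numeric to compare, which is where the question's code fails.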

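The first component is the expected_return function and the second is VaRfunc. The problem arises in the second component, where I sort a pd.Series object to obtain sortedLoss. To complete the optimization code, I define my GEKKO model like this: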

from gekko import GEKKO

predicted_returns = np.array([1.00046482, 1.0002218 , 1.00037122, 1.00089297, 1.00048257, 1.00040968, 1.00099596, 0.99995046, 0.9995493 , 1.00043154])
close_data = pd.read_csv("adj_close.csv" , index_col=0)

nd = 10
qw = GEKKO(remote=False)
x = qw.Array(qw.Var,nd,value=1/nd,lb=0,ub=1)
qw.Equation(sum(x) == 1)

qw.Maximize(expected_return(  x,
                            predicted_returns)
            
            - VaRfunc(close_data,
                        x,
                        10_000,
                        0.99))

qw.solve(disp=False)

After running the code, I get this:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
c:\Users\Shayan\Desktop\optimization_test.py in <cell line: 149>()
    146 x = qw.Array(qw.Var,nd,value=1/nd,lb=0,ub=1)
    147 qw.Equation(sum(x) == 1)
    149 qw.Maximize(Log_Caculator(  x,
    150                             predicted_returns)
    151             
--> 152                 - VaRfunc(close_data,
    153                           x,
    154                           10_000,
    155                           0.99))
    156 qw.solve(disp=False)

c:\Users\Shayan\Desktop\optimization_test.py in VaRfunc(df, weights, initial_investment, alpha)
    131 scenarios = df.apply(mapSenario)
    132 portValue = (scenarios.apply(mapPortfolioValue, args=[df, weights, initial_investment])).sum(axis=1).iloc[1:].reset_index(drop=True)
--> 133 sortedLoss = (initial_investment-portValue).sort_values(ascending=False).reset_index(drop=True)
    134 if int((1-alpha) * len(sortedLoss)) == ((1-alpha) * len(sortedLoss)):
    135     VaR = sortedLoss[int((1-alpha) * len(sortedLoss)) - 1]

File c:\Users\Shayan\Anaconda3\envs\Python3.10\lib\site-packages\pandas\util\_decorators.py:311, in deprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper(*args, **kwargs)
    305 if len(args) > num_allow_args:
    306     warnings.warn(
    307         msg.format(arguments=arguments),
    308         FutureWarning,
    309         stacklevel=stacklevel,
    310     )
--> 311 return func(*args, **kwargs)

File c:\Users\Shayan\Anaconda3\envs\Python3.10\lib\site-packages\pandas\core\series.py:3526, in Series.sort_values(self, axis, ascending, inplace, kind, na_position, ignore_index, key)
   3524 # GH 35922. Make sorting stable by leveraging nargsort
   3525 values_to_sort = ensure_key_mapped(self, key)._values if key else self._values
-> 3526 sorted_index = nargsort(values_to_sort, kind, bool(ascending), na_position)
   3528 result = self._constructor(
   3529     self._values[sorted_index], index=self.index[sorted_index]
   3530 )
   3532 if ignore_index:

File c:\Users\Shayan\Anaconda3\envs\Python3.10\lib\site-packages\pandas\core\sorting.py:417, in nargsort(items, kind, ascending, na_position, key, mask)
    415     non_nans = non_nans[::-1]
    416     non_nan_idx = non_nan_idx[::-1]
--> 417 indexer = non_nan_idx[non_nans.argsort(kind=kind)]
    418 if not ascending:
    419     indexer = indexer[::-1]

File c:\Users\Shayan\Anaconda3\envs\Python3.10\lib\site-packages\gekko\gk_operators.py:25, in GK_Operators.__len__(self)
     24 def __len__(self):
---> 25     return len(self.value)

File c:\Users\Shayan\Anaconda3\envs\Python3.10\lib\site-packages\gekko\gk_operators.py:144, in GK_Value.__len__(self)
    143 def __len__(self):
--> 144     return len(self.value)

TypeError: object of type 'int' has no len()

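I have put the adj_close.csv data on my Google Drive; feel free to test the code with it. As you can see, the problem occurs when sorting (initial_investment-portValue), which is a pd.Series object. My guess is that the root cause is that (initial_investment-portValue) contains Gekko variables, which causes a conflict when I try to sort the series. After that I gave up on sorting and tried to take the maximum of the series instead, but I got the same error; I suppose taking the maximum also calls len, so I am stuck at this stage. I need help.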

PS: Besides the adj_close.csv file, I also put the complete code at the link mentioned above, so you do not need to copy all of this.

Answer 1

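maximin (maximize the minimum quantity) and minimax (minimize the maximum quantity) are commonly used in finance to maximize the minimum return or minimize the maximum loss. There are special ways to handle this that do not involve sorting. Gekko defines the problem once and does not change the program structure afterwards, so it would need binary variables to select the index of the maximum value. The Optimization course has a web page with more details on minimax and maximin. It may help to avoid the difficulty of solving a problem with added binary variables.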

Here is an example:

 min max(x1,x2,x3)
 subject to x1 + x2 + x3 = 15

This is transformed by adding a new variable Z that is an upper bound on x1, x2, and x3:

 min Z
 s.t. x1 + x2 + x3 = 15
      Z >= x1
      Z >= x2
      Z >= x3

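It requires a few additional inequality constraints and one new variable, but it is much more efficient than solving with the binary variables that the m.max3() function uses.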

from gekko import GEKKO
m = GEKKO(remote=False)
x1,x2,x3,Z = m.Array(m.Var,4)
m.Minimize(Z)
m.Equation(x1+x2+x3==15)
m.Equations([Z>=x1,Z>=x2,Z>=x3])
m.solve()
print('x1: ',x1.value[0])
print('x2: ',x2.value[0])
print('x3: ',x3.value[0])
print('Z:  ',Z.value[0])

Solution:

x1:  5.0
x2:  5.0
x3:  5.0
Z:   4.9999999901
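
As a sanity check on the reformulation, the same answer can be reproduced without a solver. The sketch below is only an illustration and rests on assumptions that are not in the Gekko model: it restricts x1, x2, x3 to non-negative integers, and the helper name smallest_upper_bound is made up here. For any fixed point, the smallest Z that satisfies Z >= x1, Z >= x2, Z >= x3 is simply the maximum, so minimizing Z over the feasible points is the same as minimizing the maximum.

```python
def smallest_upper_bound(xs):
    # For fixed xs, the tightest Z with Z >= x for every x in xs is max(xs).
    return max(xs)


# All non-negative integer splits with x1 + x2 + x3 = 15 (coarse, illustrative grid).
candidates = [(a, b, 15 - a - b) for a in range(16) for b in range(16 - a)]

# Minimizing the tightest upper bound over the feasible points is the minimax problem.
best = min(candidates, key=smallest_upper_bound)
print(best, smallest_upper_bound(best))  # (5, 5, 5) 5
```

The equal split is the only point on this grid whose maximum is 5, which agrees with the solver output up to tolerance.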

I suggest trying it to see whether it helps.
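
Carried over to the portfolio in the question, the same trick might be written as follows. This is a sketch under stated assumptions rather than a drop-in replacement: it swaps the 99% VaR for the worst-case scenario loss, which is a different and more conservative risk measure. The symbols are introduced here for illustration: $w_j$ are the weights, $\mu_j$ the entries of predicted_returns, $I$ the initial investment, and $p_{s,j}$ the adjusted close price of equity $j$ on day $s$. The loss expression follows the scenario and portfolio-value formulas in VaRfunc.

```latex
\begin{aligned}
\max_{w,\,Z} \quad & \sum_{j} w_j \log_{10}(\mu_j) \;-\; Z \\
\text{s.t.} \quad & \sum_{j} w_j = 1, \qquad 0 \le w_j \le 1, \\
& Z \;\ge\; I \Bigl( 1 - \sum_{j} w_j \, \frac{p_{s,j}}{p_{s-1,j}} \Bigr)
  \quad \text{for every scenario } s .
\end{aligned}
```

Each scenario contributes one inequality that is linear in the weights, so no sorting, maximum function, or binary variable is needed.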
