How to use a sorting procedure in an optimization with Gekko (Python)

Question

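I am trying to solve an optimization problem in finance: find the weight of each equity in a portfolio so as to maximize the portfolio's expected return and minimize its value at risk (VaR). For now I am modeling this with a single objective function rather than a multi-objective model. The objective has two components: the portfolio's expected return, with a positive coefficient (+1), and the portfolio's VaR, with a negative coefficient (-1). So the model looks like this:
Maximize: expected_return - VaR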

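About a year and a half ago I wrote Python code using the Gekko library for the first component (maximize: expected_return), and now I am trying to include the second component as well. The problem is that, to compute the portfolio's VaR, I have to sort a pd.Series and pick the value at a specific index of the sorted series. Every value in that series contains Gekko variables, so it cannot be sorted. I am sharing the code and the error below to give more detail.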

First, let's look at the objective functions:

import pandas as pd
import numpy as np

def expected_return(x,Array):
    res = 0
    for j in range(len(Array)):
        res += x[j] * qw.log10(Array[j])
    return res



def VaRfunc(df:pd.DataFrame, weights:np.ndarray, initial_investment:int, alpha:float):
    """
    Calculating the Value at Risk of the portfolio with the given weights

    Params
    
    ------

        df: pd.DataFrame
            dataframe of adjusted close prices

        weights: np.ndarray
            given weights of each equity in the portfolio

        initial_investment: int
            initial investment

        alpha: float
            the confidence level for calculating the VaR of the portfolio

    Returns

    ------
        VaR: float
            value at risk of the portfolio
    
    """
    def ScenarioMaker(x):
        """
        Helper function for mapSenario

        Help

        ----

            x[0]: close price i-1 (from series.shift())

            x[1]: close price i

            x[2]: last close price
        """
        if isinstance(x[0], float):
            return (x[2] * x[1])/x[0]
        else:
            pass

    def mapSenario(series:pd.Series):
        """
        This function Should be used inside the df.apply as
        function.

        params

        ------

            series: pd.Series
                each column of DataFrame

        Returns

        -------

            Calculated Scenarios for each column of the given DataFrame
            through df.apply
        """
        last_close = series.iloc[-1].repeat(len(series))
        return list(map(ScenarioMaker, zip(series.shift(), series , last_close)))

    def mapPortfolioValue(series, df, weights, initial_investment):
        """
        This function should be passed as a function of df.apply which df is the
        calculated scenarios dataframe through the `mapSenario` function.

        Params

        ------

            series: pd.Series
                each passed series through df.apply

            df: pd.DataFrame
                dataframe of adjusted Close Prices

            weights: np.array
                weights of each equity in the portfolio

            initial_investment: int
                initial investment

        
        Returns
        
        -------

            pd.Series of the portfolio value **for each given series!** 
        """
        idx = df.columns.tolist().index(series.name)
        last_close = df.iloc[-1,idx]
        return list(map(lambda x: (initial_investment*weights[idx]*x)/last_close, series))


    # Creating scenarios based on adj close price of each day
    scenarios = df.apply(mapSenario)

    # Calculating portfolio value based on created scenarios for each day
    portValue = (scenarios.apply(mapPortfolioValue, args=[df, weights, initial_investment])).sum(axis=1).iloc[1:].reset_index(drop=True)
    
    # Calculate loss of portfolio based on the weight of each stock in the portfolio
    # and subtracting the portfolio's value of each day from the initial investment
    # which is 10_000 and Sorting them in descending order.
    sortedLoss = (initial_investment-portValue).sort_values(ascending=False).reset_index(drop=True)

    if int((1-alpha) * len(sortedLoss)) == ((1-alpha) * len(sortedLoss)):
        VaR = sortedLoss[int((1-alpha) * len(sortedLoss)) - 1]
    else:
        VaR = (sortedLoss[int((1-alpha) * len(sortedLoss)) - 1] +  sortedLoss[int((1-alpha) * len(sortedLoss))])/2

    return VaR

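The first component is the expected_return function and the second is VaRfunc. The problem arises in the second component, where I sort a pd.Series to build sortedLoss. To complete the optimization code, I define my GEKKO model like this: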

from gekko import GEKKO

predicted_returns = np.array([1.00046482, 1.0002218 , 1.00037122, 1.00089297, 1.00048257, 1.00040968, 1.00099596, 0.99995046, 0.9995493 , 1.00043154])
close_data = pd.read_csv("adj_close.csv" , index_col=0)

nd = 10
qw = GEKKO(remote=False)
x = qw.Array(qw.Var,nd,value=1/nd,lb=0,ub=1)
qw.Equation(sum(x) == 1)

qw.Maximize(expected_return(  x,
                            predicted_returns)
            
            - VaRfunc(close_data,
                        x,
                        10_000,
                        0.99))

qw.solve(disp=False)

After running the code, I get this:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
c:\Users\Shayan\Desktop\optimization_test.py in <cell line: 149>()
    146 x = qw.Array(qw.Var,nd,value=1/nd,lb=0,ub=1)
    147 qw.Equation(sum(x) == 1)
    149 qw.Maximize(Log_Caculator(  x,
    150                             predicted_returns)
    151             
--> 152                 - VaRfunc(close_data,
    153                           x,
    154                           10_000,
    155                           0.99))
    156 qw.solve(disp=False)

c:\Users\Shayan\Desktop\optimization_test.py in VaRfunc(df, weights, initial_investment, alpha)
    131 scenarios = df.apply(mapSenario)
    132 portValue = (scenarios.apply(mapPortfolioValue, args=[df, weights, initial_investment])).sum(axis=1).iloc[1:].reset_index(drop=True)
--> 133 sortedLoss = (initial_investment-portValue).sort_values(ascending=False).reset_index(drop=True)
    134 if int((1-alpha) * len(sortedLoss)) == ((1-alpha) * len(sortedLoss)):
    135     VaR = sortedLoss[int((1-alpha) * len(sortedLoss)) - 1]

File c:\Users\Shayan\Anaconda3\envs\Python3.10\lib\site-packages\pandas\util\_decorators.py:311, in deprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper(*args, **kwargs)
    305 if len(args) > num_allow_args:
    306     warnings.warn(
    307         msg.format(arguments=arguments),
    308         FutureWarning,
    309         stacklevel=stacklevel,
    310     )
--> 311 return func(*args, **kwargs)

File c:\Users\Shayan\Anaconda3\envs\Python3.10\lib\site-packages\pandas\core\series.py:3526, in Series.sort_values(self, axis, ascending, inplace, kind, na_position, ignore_index, key)
   3524 # GH 35922. Make sorting stable by leveraging nargsort
   3525 values_to_sort = ensure_key_mapped(self, key)._values if key else self._values
-> 3526 sorted_index = nargsort(values_to_sort, kind, bool(ascending), na_position)
   3528 result = self._constructor(
   3529     self._values[sorted_index], index=self.index[sorted_index]
   3530 )
   3532 if ignore_index:

File c:\Users\Shayan\Anaconda3\envs\Python3.10\lib\site-packages\pandas\core\sorting.py:417, in nargsort(items, kind, ascending, na_position, key, mask)
    415     non_nans = non_nans[::-1]
    416     non_nan_idx = non_nan_idx[::-1]
--> 417 indexer = non_nan_idx[non_nans.argsort(kind=kind)]
    418 if not ascending:
    419     indexer = indexer[::-1]

File c:\Users\Shayan\Anaconda3\envs\Python3.10\lib\site-packages\gekko\gk_operators.py:25, in GK_Operators.__len__(self)
     24 def __len__(self):
---> 25     return len(self.value)

File c:\Users\Shayan\Anaconda3\envs\Python3.10\lib\site-packages\gekko\gk_operators.py:144, in GK_Value.__len__(self)
    143 def __len__(self):
--> 144     return len(self.value)

TypeError: object of type 'int' has no len()

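I have put the adj_close.csv data on my Google Drive; feel free to test the code with it. As you can see, the error occurs when sorting (initial_investment-portValue), which is a pd.Series object. I guess the root cause is that (initial_investment-portValue) contains Gekko variables, which causes a conflict when I try to sort the series. After that, I gave up on sorting and tried to take the maximum of the series instead, but I got the same error; I guess taking the maximum also calls len. So I am stuck at this stage and need help.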
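For reference, the tail-selection rule at the end of VaRfunc is straightforward once the losses are plain numbers, for example after the weights have been fixed. The sketch below restates only that rule in NumPy; historical_var and the sample losses are hypothetical names and data for illustration, not part of the original code, and the guard for very small samples is an added assumption.

```python
import numpy as np

def historical_var(losses, alpha):
    """Pick the VaR from numeric scenario losses using the index rule in VaRfunc."""
    # Sort losses in descending order, as VaRfunc does with sort_values(ascending=False)
    sorted_loss = np.sort(np.asarray(losses, dtype=float))[::-1]
    k = (1 - alpha) * len(sorted_loss)
    if k < 1:
        # Added guard (assumption): the rule needs at least one loss in the tail
        raise ValueError("too few scenarios for this confidence level")
    if int(k) == k:
        return sorted_loss[int(k) - 1]
    # Non-integer position: average the two neighbouring losses
    return (sorted_loss[int(k) - 1] + sorted_loss[int(k)]) / 2

# Hypothetical losses, only to show the call
sample_losses = [120.0, -30.0, 45.0, 210.0, 15.0, -80.0, 95.0, 60.0]
print(historical_var(sample_losses, 0.75))
```

This works only because every entry is a number. Inside the Gekko model the entries are symbolic expressions in x, so the same sort has nothing to compare.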

PS: Apart from the adj_close.csv file, I have put the complete code at the link mentioned above, so you do not need to copy all of this.

Answer 1

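Maximin (maximize the minimum) and minimax (minimize the maximum) formulations are commonly used in finance to maximize the minimum return or minimize the maximum loss. There is a special way to handle them that does not involve sorting. Gekko defines the problem once and does not change the program structure afterwards, so selecting the index of the maximum value would otherwise require binary variables. A page in the optimization course gives more details on minimax and maximin. The reformulation described there may help avoid the difficulty of solving a problem with added binary variables.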

Here is an example:

 min max(x1,x2,x3)
 subject to x1 + x2 + x3 = 15

This is transformed by adding a new variable Z that is an upper bound on x1, x2, and x3:

 min Z
 s.t. x1 + x2 + x3 = 15
      Z >= x1
      Z >= x2
      Z >= x3

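It requires a few additional inequality constraints and one new variable, but it is much more efficient to solve than the binary-variable formulation used by the m.max3() function.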

from gekko import GEKKO
m = GEKKO(remote=False)
x1,x2,x3,Z = m.Array(m.Var,4)
m.Minimize(Z)
m.Equation(x1+x2+x3==15)
m.Equations([Z>=x1,Z>=x2,Z>=x3])
m.solve()
print('x1: ',x1.value[0])
print('x2: ',x2.value[0])
print('x3: ',x3.value[0])
print('Z:  ',Z.value[0])

Solution:

x1:  5.0
x2:  5.0
x3:  5.0
Z:   4.9999999901

I suggest trying this to see if it helps.
