
Why does my user-defined function return only the first group's result when executing groupby.apply?

Summary

When groupby is used, the result is as follows. (The original post showed a screenshot of the output here.)

The value 2.19 is what the user-defined function returns for the first group. That is, when the function is run on mulCut[(mulCut['date'] == '2018-03-05') & (mulCut['moneyness'] == 'atm')], I get 2.19.

Explanation

I'm trying to get a different return for each group by using .groupby().apply(). In my case, the groups are defined by two variables, 'date' and 'moneyness', as below. As you can see in the DataFrame below, 'moneyness' contains four categories: 'atm', 'itm', 'otm' and 'tot'. (The original post showed a screenshot of the DataFrame here.)
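To make the grouping concrete, here is a minimal sketch of a two-key groupby.apply on toy data. The column names 'date', 'moneyness', 'time' and 'kospi200' come from the question; the values and the helper price_change are made up for illustration and are not the asker's data or function.

```python
import pandas as pd

# Toy data (hypothetical values): 2 dates x 4 moneyness categories x 2 timestamps
rows = []
for d, date in enumerate(["2018-03-05", "2018-03-06"]):
    for m, moneyness in enumerate(["atm", "itm", "otm", "tot"]):
        for t, time in enumerate(["09:05", "14:50"]):
            rows.append({
                "date": date,
                "moneyness": moneyness,
                "time": time,
                "kospi200": 300.0 + 10 * d + (m + 1) * t,
            })
df = pd.DataFrame(rows)


def price_change(group):
    """Hypothetical per-group function: last price minus first price."""
    ordered = group.sort_values("time")  # returns a new frame, input is untouched
    return ordered["kospi200"].iloc[-1] - ordered["kospi200"].iloc[0]


# One scalar per (date, moneyness) pair; the result is a Series with a MultiIndex.
# Selecting the value columns explicitly keeps the grouping keys out of each group.
per_group = (
    df.groupby(["date", "moneyness"])[["time", "kospi200"]]
      .apply(price_change)
)
print(per_group)
```

Each (date, moneyness) pair should get its own value here, which is the behaviour the question expects from get_onedayRt.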

My user-defined function is as follows. It calculates the return from trading the kospi index between 9:05 and 14:50. Briefly, the strategy buys or sells the kospi index according to a signal: '>= sensitivity' is a buy signal and '<= 1/sensitivity' is a sell signal. Since I assume that I commit my whole budget on each signal, a sell signal is ignored when I am already short. Similarly, a buy signal is ignored when I have already bought the kospi index. Lastly, at the last minute (14:50) the position must be liquidated. That is, if I am short at 14:49, I must buy kospi200 no matter what signal I receive at 14:50. Similarly, if I am long at 14:49, I must sell kospi200.

import numpy as np

def get_onedayRt(onedayDf, timeVrbl, cpVrbl, kospiVrbl, sensitivity):
    onedayDf['action'] = np.nan
    state = 0 # 0: can buy or short sell, 1: can only sell, -1: can only buy
    value = 0 # return of simulation
    targetDf = onedayDf.sort_values(timeVrbl)
    targetDf = targetDf.reset_index(drop = True)
    lastidx = len(onedayDf) - 1

    for idx, timeData in targetDf.iterrows():
        if timeData[cpVrbl] >= sensitivity:
            if state == -1:
                state += 1
                targetDf.loc[idx, 'action'] = 1 #buy
                value -= timeData[kospiVrbl]

            elif state == 0:
                state += 1
                targetDf.loc[idx, 'action'] = 1
                value -= timeData[kospiVrbl]

        elif timeData[cpVrbl] <= 1/sensitivity:
            if state == 1:
                state -= 1
                targetDf.loc[idx, 'action'] = -1 # sell
                value += timeData[kospiVrbl]

            elif state == 0:
                state -= 1
                targetDf.loc[idx, 'action'] = -1
                value += timeData[kospiVrbl]

        if lastidx - 1 == idx:
            break # the last action is determined below

    if state == -1:
        targetDf.loc[lastidx, 'action'] = 1
        value -= targetDf.loc[lastidx, kospiVrbl]
    elif state == 1:
        targetDf.loc[lastidx, 'action'] = -1
        value += targetDf.loc[lastidx, kospiVrbl]

    return value

I found that my function works correctly on each individual group. For example, the code below works and returns 2.97, which is the value I expected.

tmp = mulCut[(mulCut['date'] == '2018-03-05') & (mulCut['moneyness'] == 'tot')]
get_onedayRt(tmp, 'time', 'call/put', 'kospi200', 1)
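As a side note, the same single-group check can be done through the groupby object itself, which makes it easier to walk over every group by hand. This is only a sketch on made-up data (the values below are not from the question); it relies on the standard get_group method and on iterating over the groupby.

```python
import pandas as pd

# Hypothetical toy frame with the question's column names
df = pd.DataFrame({
    "date": ["2018-03-05", "2018-03-05", "2018-03-05", "2018-03-06"],
    "moneyness": ["atm", "tot", "tot", "tot"],
    "kospi200": [100.0, 101.0, 102.0, 103.0],
})

# Selecting one group with a boolean mask, as in the question
by_mask = df[(df["date"] == "2018-03-05") & (df["moneyness"] == "tot")]

# Selecting the same group through the groupby object
grouped = df.groupby(["date", "moneyness"])
by_group = grouped.get_group(("2018-03-05", "tot"))

# Both selections point at the same rows
same_rows = by_mask.index.equals(by_group.index)

# Iterating yields (key, sub-DataFrame) pairs, one per group
keys = [key for key, _ in grouped]
```

Calling the function on each sub-DataFrame in such a loop is a quick way to compare the per-group results with what groupby.apply returns.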

Therefore, I wonder why my user-defined function returns only the first group's result when it is run through groupby.apply. How can I edit my code to fix the problem?

Thank you for reading my long question.

I finally solved my problem. The first line of my function (onedayDf['action'] = np.nan) was the source of the problem. After deleting that line, my code works properly.
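The line in question writes a new column into the group that groupby.apply passes in. The pandas documentation for groupby.apply notes that functions which mutate the passed object can produce unexpected behavior and are not supported, which is consistent with what happened here. Below is a condensed sketch of the same trading rule that never writes to its input. The name get_oneday_return, the toy data and the sensitivity of 1.1 are assumptions for illustration, and the 'action' bookkeeping is dropped, so this is not the asker's exact function.

```python
import pandas as pd


def get_oneday_return(group, time_col, signal_col, price_col, sensitivity):
    """Hypothetical, non-mutating variant of get_onedayRt (no 'action' column)."""
    # sort_values and reset_index return new objects, so `group` is left untouched
    target = group.sort_values(time_col).reset_index(drop=True)
    last = len(target) - 1
    state = 0    # 0: flat, 1: long (can only sell), -1: short (can only buy)
    value = 0.0  # running return of the simulation
    for idx in range(last):  # every row except the last one
        signal = target.loc[idx, signal_col]
        price = target.loc[idx, price_col]
        if signal >= sensitivity:
            if state <= 0:   # buy signal, ignored when already long
                state += 1
                value -= price
        elif signal <= 1 / sensitivity:
            if state >= 0:   # sell signal, ignored when already short
                state -= 1
                value += price
    # Liquidate any open position at the last timestamp
    if state == -1:
        value -= target.loc[last, price_col]
    elif state == 1:
        value += target.loc[last, price_col]
    return value


# Made-up data: two groups on one date, three timestamps each
df = pd.DataFrame({
    "date": ["2018-03-05"] * 6,
    "moneyness": ["atm"] * 3 + ["tot"] * 3,
    "time": [905, 906, 907] * 2,
    "call/put": [1.2, 0.8, 1.0, 0.8, 1.0, 1.2],
    "kospi200": [100.0, 102.0, 101.0, 100.0, 102.0, 101.0],
})

result = (
    df.groupby(["date", "moneyness"])[["time", "call/put", "kospi200"]]
      .apply(lambda g: get_oneday_return(g, "time", "call/put", "kospi200", 1.1))
)
print(result)
```

On this toy data the two groups follow different trades (buy then sell for 'atm', short then forced buy-back for 'tot'), so they get different values, and df itself gains no 'action' column.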


 