Pandas vectorization: speeding up a DataFrame function

I have a Python implementation of the SuperTrend indicator that uses a pandas DataFrame. The code works, but the superTrend function gets slower and slower as the DataFrame grows. How can I convert the for loop in superTrend into a vectorized pandas operation, or rewrite it with the apply() method?

def trueRange(df):
    df['prevClose'] = df['close'].shift(1)
    df['high-low'] = df['high'] - df['low']
    df['high-pClose'] = abs(df['high'] - df['prevClose'])
    df['low-pClose'] = abs(df['low'] - df['prevClose'])
    tr = df[['high-low','high-pClose','low-pClose']].max(axis=1)
    
    return tr

def averageTrueRange(df, peroid=12):
    df['trueRange'] = trueRange(df)
    the_atr = df['trueRange'].rolling(peroid).mean()
    
    return the_atr
    

def superTrend(df, peroid=5, multipler=1.5):
    df['averageTrueRange'] = averageTrueRange(df, peroid=peroid)
    h2 = ((df['high'] + df['low']) / 2)
    df['Upperband'] = h2 + (multipler * df['averageTrueRange'])
    df['Lowerband'] = h2 - (multipler * df['averageTrueRange'])
    df['inUptrend'] = None

    # Integer labels are used below, so this assumes the default RangeIndex.
    for current in range(1, len(df.index)):
        prev = current - 1

        # Write with df.loc instead of chained df[col].iloc[...] assignment,
        # which may modify a temporary copy rather than df itself.
        if df['close'][current] > df['Upperband'][prev]:
            df.loc[current, 'inUptrend'] = True

        elif df['close'][current] < df['Lowerband'][prev]:
            df.loc[current, 'inUptrend'] = False
        else:
            df.loc[current, 'inUptrend'] = df['inUptrend'][prev]

            if df['inUptrend'][current] and df['Lowerband'][current] < df['Lowerband'][prev]:
                df.loc[current, 'Lowerband'] = df['Lowerband'][prev]

            if not df['inUptrend'][current] and df['Upperband'][current] > df['Upperband'][prev]:
                df.loc[current, 'Upperband'] = df['Upperband'][prev]

Use .values[1:] and .values[:-1] for the vectorized comparison.
In terms of your code, .values[1:] corresponds to current and .values[:-1] corresponds to prev.
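
To see how the two slices line up, here is a small sketch on a toy frame. The column names follow the question, but the numbers and the variable name `above_prev_upper` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy data: column names follow the question, the values are invented.
df = pd.DataFrame({
    "close":     [10.0, 12.0, 11.0, 9.0],
    "Upperband": [11.0, 11.5, 12.0, 12.5],
})

current = df["close"].values[1:]      # rows 1..n-1
prev = df["Upperband"].values[:-1]    # rows 0..n-2

# Element i compares close[i + 1] with Upperband[i],
# the same pairing the loop makes with current and prev.
above_prev_upper = current > prev
print(above_prev_upper)
```

The result has one element fewer than the frame has rows, because the first row has nothing before it.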

Here is an example of converting the if statements into a vectorized comparison:

import numpy as np

# Compare each row's close with the previous row's upper band.
cond1 = df['close'].values[1:] > df['Upperband'].values[:-1]
cond1 = np.insert(cond1, 0, False)
df.loc[cond1, 'inUptrend'] = True

np.insert is needed because the 0th element has no previous element to compare against, so the mask is padded with False at position 0 to match the length of the DataFrame.
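
The same idea can be extended to the elif and else branches. The following is a rough sketch, not a drop-in replacement: the helper name `trend_flags` and the demo numbers are hypothetical, and it returns 1.0 for an uptrend, 0.0 for a downtrend and NaN while the trend is still undecided. It also assumes the bands are compared as first computed, so it leaves out the part of the loop that carries Upperband and Lowerband forward.

```python
import numpy as np
import pandas as pd

def trend_flags(df):
    """Hypothetical helper: 1.0 = uptrend, 0.0 = downtrend, NaN = undecided."""
    close = df["close"].to_numpy()
    upper = df["Upperband"].to_numpy()
    lower = df["Lowerband"].to_numpy()

    # Row 0 has no previous row, so both masks are padded with False.
    broke_up = np.insert(close[1:] > upper[:-1], 0, False)
    broke_down = np.insert(close[1:] < lower[:-1], 0, False)

    # np.select uses the first condition that matches, like if/elif.
    raw = np.select([broke_up, broke_down], [1.0, 0.0], default=np.nan)

    # The else branch copies the previous flag, which is a forward fill.
    return pd.Series(raw, index=df.index).ffill()

# Made-up numbers for illustration only.
demo = pd.DataFrame({
    "close":     [10.0, 12.0, 11.0, 8.0, 9.0],
    "Upperband": [11.0, 13.0, 13.0, 12.0, 12.0],
    "Lowerband": [9.0, 10.0, 10.0, 8.5, 8.5],
})
demo["inUptrend"] = trend_flags(demo)
print(demo)
```

One caveat: in the question's loop, each band adjustment depends on the previously adjusted band, and later comparisons read those adjusted values. That part is inherently sequential, so the flags from this sketch can differ from the loop's whenever an adjustment changes a later comparison.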
