Calculate column in Pandas Dataframe using adjacent rows without iterating through each row

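I would like to know whether there is a way to calculate a column in a dataframe that uses something similar to a moving average, without iterating through each row. Current working code: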

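import pandas as pd
import talib


def create_candles(ticks, instrument, time_slice):
    # Build OHLC candles from tick prices and back-fill empty intervals.
    # The original passed base=00; 0 was the default, and pandas 2.0 removed the argument.
    candlesticks = ticks.price.resample(time_slice).ohlc().bfill()
    volume = ticks.amount.resample(time_slice).sum()
    candlesticks['volume'] = volume
    candlesticks['instrument'] = instrument
    # Placeholder columns, filled in by calculate_indicators
    candlesticks['ttr'] = 0
    # candlesticks['vr_7'] = 0
    candlesticks['vr_10'] = 0
    # calculate_indicators takes two arguments, so time_slice is not passed
    candlesticks = calculate_indicators(candlesticks, instrument)

    return candlesticks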


def calculate_indicators(candlesticks, instrument):
    candlesticks.sort_index(inplace=True)
    # candlesticks['rsi_14'] = talib.RSI(candlesticks.close, timeperiod=14)
    candlesticks['lr_50'] = talib.LINEARREG(candlesticks.close, timeperiod=50)
    # candlesticks['lr_150'] = talib.LINEARREG(candlesticks.close, timeperiod=150)
    # candlesticks['ema_55'] = talib.EMA(candlesticks.close, timeperiod=55)
    # candlesticks['ema_28'] = talib.EMA(candlesticks.close, timeperiod=28)
    # candlesticks['ema_18'] = talib.EMA(candlesticks.close, timeperiod=18)
    # candlesticks['ema_9'] = talib.EMA(candlesticks.close, timeperiod=9)
    # candlesticks['wma_21'] = talib.WMA(candlesticks.close, timeperiod=21)
    # candlesticks['wma_12'] = talib.WMA(candlesticks.close, timeperiod=12)
    # candlesticks['wma_11'] = talib.WMA(candlesticks.close, timeperiod=11)
    # candlesticks['wma_5'] = talib.WMA(candlesticks.close, timeperiod=5)
    candlesticks['cmo_9'] = talib.CMO(candlesticks.close, timeperiod=9)

    for row in candlesticks.itertuples():
        current_index = candlesticks.index.get_loc(row.Index)
        if current_index >= 1:
            previous_close = candlesticks.iloc[current_index - 1, candlesticks.columns.get_loc('close')]
            candlesticks.iloc[current_index, candlesticks.columns.get_loc('ttr')] = max(
                row.high - row.low,
                abs(row.high - previous_close),
                abs(row.low - previous_close))

        if current_index > 10:
            candlesticks.iloc[current_index, candlesticks.columns.get_loc('vr_10')] = candlesticks.iloc[current_index, candlesticks.columns.get_loc('ttr')] / (
                max(candlesticks.high[current_index - 9: current_index].max(), candlesticks.close[current_index - 11]) -
                min(candlesticks.low[current_index - 9: current_index].min(), candlesticks.close[current_index - 11]))

    candlesticks['timestamp'] = pd.to_datetime(candlesticks.index)
    candlesticks['instrument'] = instrument
    candlesticks.fillna(0, inplace=True)
    return candlesticks

In the iteration, I am calculating the true range ('TTR') and then the volatility ratio ('VR_10').

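TTR is calculated on every row in the DF except the first. It uses the close column of the previous row, and the high and low columns of the current row.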
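The TTR rule described above can be expressed without a loop by shifting the close column down one row. This is only a minimal sketch, assuming a frame with `high`, `low` and `close` columns as in the question; the function name `true_range` and the sample data are hypothetical and not part of the original post.

```python
import pandas as pd


def true_range(candles: pd.DataFrame) -> pd.Series:
    """True range per row; the first row has no previous close and stays 0."""
    prev_close = candles["close"].shift(1)  # close of the previous row
    ranges = pd.concat(
        [
            candles["high"] - candles["low"],
            (candles["high"] - prev_close).abs(),
            (candles["low"] - prev_close).abs(),
        ],
        axis=1,
    )
    ttr = ranges.max(axis=1)  # row-wise maximum of the three candidates
    if len(ttr) > 0:
        ttr.iloc[0] = 0  # mirror the loop, which leaves the first row at 0
    return ttr


# Hypothetical sample data, for illustration only
candles = pd.DataFrame(
    {
        "high": [10.0, 12.0, 11.0],
        "low": [8.0, 9.0, 10.5],
        "close": [9.0, 11.0, 10.8],
    }
)
ttr = true_range(candles)
```

The explicit reset of the first row matters because `max(axis=1)` skips missing values, so without it the first row would receive `high - low` instead of 0.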

VR_10 is calculated on every row except the first 10. It uses the high and low columns of the previous 9 rows and the close of the 10th row.
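A loop-free version of the VR_10 rule might combine `shift` with `rolling`. The sketch below follows the indexing of the posted loop, which takes the high and low of the 9 preceding rows and the close from 11 rows back, rather than the wording of the description. The function name `volatility_ratio`, its `lookback` and `close_lag` parameters, and the sample data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def volatility_ratio(candles, ttr, lookback=9, close_lag=11):
    """TTR divided by the range spanned by the preceding rows (hypothetical helper)."""
    # Highest high / lowest low over the `lookback` rows before the current one
    prev_high = candles["high"].shift(1).rolling(lookback).max()
    prev_low = candles["low"].shift(1).rolling(lookback).min()
    # Reference close, `close_lag` rows back (11 in the posted loop)
    ref_close = candles["close"].shift(close_lag)
    upper = np.maximum(prev_high, ref_close)
    lower = np.minimum(prev_low, ref_close)
    # Rows without enough history come out as NaN and are set to 0, like the loop
    return (ttr / (upper - lower)).fillna(0)


# Hypothetical sample data, for illustration only
n = 14
close = pd.Series(100 + np.sin(np.arange(n)))
candles = pd.DataFrame({"high": close + 1.0, "low": close - 1.0, "close": close})
ttr = pd.Series(2.0, index=candles.index)
vr = volatility_ratio(candles, ttr)
```

Because `np.maximum` and `np.minimum` propagate missing values, every row with fewer than 11 predecessors ends up at 0, which matches the `current_index > 10` guard in the loop.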

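Edit 2: I have tried many ways to add a text-based dataframe to this question, but none of them seems to work for a frame as wide as mine. There is no difference between the input and output dataframes, except that the columns TTR and VR_10 are all 0 in the input and have non-zero values in the output. An example would be this dataframe: candlesticks dataframe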

Is there a way to do this without iterating?

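With Andreas' nudge toward using rolling, I arrived at an answer. First, I had to figure out how to use rolling with multiple columns. I found that here. I made one modification, because I needed to roll up, not down: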

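import pandas as pd
# Assumed import: `stride` is not defined in the post; as_strided matches its usage
from numpy.lib.stride_tricks import as_strided as stride


def roll(df, w, **kwargs):
    # Newest rows first, so each window starts at the current row and looks back
    df.sort_values(by='timestamp', ascending=False, inplace=True)
    v = df.values
    d0, d1 = v.shape
    s0, s1 = v.strides

    # View of overlapping windows: (number of windows, window length, columns)
    a = stride(v, (d0 - (w - 1), w, d1), (s0, s0, s1))

    # One small DataFrame per window, keyed by the index of its first row
    rolled_df = pd.concat({
        row: pd.DataFrame(values, columns=df.columns)
        for row, values in zip(df.index, a)
    })

    return rolled_df.groupby(level=0, **kwargs)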

After that, I created two functions:

def calculate_vr(window):
    return window.iloc[0].ttr / (max(window.high[1:9].max(), window.iloc[10].close) - min(window.low[1:9].min(), window.iloc[10].close))


def calculate_ttr(window):
    return max(window.iloc[0].high - window.iloc[0].low, abs(window.iloc[0].high - window.iloc[1].close), abs(window.iloc[0].low - window.iloc[1].close))

And called these functions like so:

    candlesticks['ttr'] = roll(candlesticks, 3).apply(calculate_ttr)
    candlesticks['vr_10'] = roll(candlesticks, 11).apply(calculate_vr)

I added timers to both approaches, and this way is roughly 3 times slower than the iteration.
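The post does not show how the timers were added. One simple way to compare two approaches is a small wall-clock helper such as the sketch below; `time_call` is a hypothetical name, and the built-in `sum` stands in for the function being measured.

```python
import time


def time_call(func, *args, repeats=3, **kwargs):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


# Stand-in workload; replace with the function under test
elapsed = time_call(sum, range(100_000))
```

Taking the best of several runs reduces noise from other processes. Note that a function which modifies its input in place, as `calculate_indicators` does, should be given a fresh copy of the data on each call.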
