How to speed up pandas df.apply (possibly using np.where or np.logical_and.reduce?)

Question

I want to speed up the creation of a new column in a pandas DataFrame. The code runs myFunc() on every row:

df = pd.DataFrame(data, columns=["EMA4","EMA4prior","EMA10","MACD"])

    def myFunc(self, row):
        if (row.EMA4 > row.EMA10) and (row.EMA4prior < row.EMA10) and (row.MACD > 0):
            return 0
        if (row.EMA4 < row.EMA10) and (row.EMA4prior > row.EMA10) and (row.MACD < 0):
            return 1
        return -1

self.df["position"] = self.df.apply(self.myFunc, axis=1) #apply this per each row

This code works, but it is very slow. I tried the following approaches to improve it, but something in the syntax seems to be wrong:

1.- Using numpy.where directly:

a=self.df["EMA4"].values
b=self.df["EMA4prior"].values
c=self.df["EMA10"].values
d=self.df["MACD"].values
self.df["position"] = np.where(((a > c)&(b < c)&(e > 0)),0, 
                       (np.where((a < c)&(b > c)&(d < 0)), 1, -1))

2.- Using np.logical_and.reduce, since np.logical_and appears to be a binary operator (I have 3 "and" conditions to evaluate):

self.df["position"] = np.where(np.logical_and.reduce([(a > c),(b < c),(e > 0)]),0,
                        (np.where(np.logical_and.reduce[(a < c),(b > c),(e < 0)]), 1, -1))

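I could not get either one to work: the code does not compile, and I am not sure what is wrong.

So, is there a way to speed up the original myFunc() with numpy or some other approach to improve performance?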

Answer 1

IIUC, you should be able to use np.select:

cond = [(df.EMA4 > df.EMA10) & (df.EMA4prior < df.EMA10) & (df.MACD > 0), 
        (df.EMA4 < df.EMA10) & (df.EMA4prior > df.EMA10) & (df.MACD < 0)]
result = [0,1]

df['position'] = np.select(cond, result, -1)
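
To check this against the original row-wise function, here is a self-contained sketch. The random sample data, the seed, and names such as my_func, pos_where, pos_reduce and pos_select are illustrative assumptions, not part of the original post. It also repairs the two attempts from the question, which appear to trip over three things: e is never defined (d holds MACD), the inner np.where closes its parenthesis before the 1, -1 arguments, and np.logical_and.reduce is subscripted with [...] instead of being called.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(1000, 4)),
                  columns=["EMA4", "EMA4prior", "EMA10", "MACD"])

def my_func(row):
    # Row-wise reference version from the question (without self).
    if (row.EMA4 > row.EMA10) and (row.EMA4prior < row.EMA10) and (row.MACD > 0):
        return 0
    if (row.EMA4 < row.EMA10) and (row.EMA4prior > row.EMA10) and (row.MACD < 0):
        return 1
    return -1

expected = df.apply(my_func, axis=1).to_numpy()

a = df["EMA4"].to_numpy()
b = df["EMA4prior"].to_numpy()
c = df["EMA10"].to_numpy()
d = df["MACD"].to_numpy()

# Attempt 1, repaired: use d instead of the undefined e, and keep all
# three arguments inside the inner np.where call.
pos_where = np.where((a > c) & (b < c) & (d > 0), 0,
                     np.where((a < c) & (b > c) & (d < 0), 1, -1))

# Attempt 2, repaired: reduce is a method, so call it with parentheses.
pos_reduce = np.where(
    np.logical_and.reduce([a > c, b < c, d > 0]), 0,
    np.where(np.logical_and.reduce([a < c, b > c, d < 0]), 1, -1))

# np.select: the first matching condition wins, otherwise the default.
cond = [(df.EMA4 > df.EMA10) & (df.EMA4prior < df.EMA10) & (df.MACD > 0),
        (df.EMA4 < df.EMA10) & (df.EMA4prior > df.EMA10) & (df.MACD < 0)]
pos_select = np.select(cond, [0, 1], default=-1)

df["position"] = pos_select
```

All three vectorized variants should produce the same values as df.apply on this data. np.select tends to read more cleanly than nested np.where calls once there are more than two branches.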

Solution 1
2 Accepted 2017-11-01 16:37:42
