How to iterate over two columns in a pandas DataFrame to append values to a list

Question

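I am trying to evaluate a condition on one pandas column and, depending on the result, take the value from another column and append it to a list.

I tried the following: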

def roc_table(df, row_count, signal, returns):
    """
    

    Parameters
    ----------
    df : dataframe
    row_count : length of data
    signal : signal/s
    returns : log returns

    Returns
    -------
    table - hopefully

    """
    df = df.copy()
    
    bins = [-48.13,-38.70, -29.28, -19.85, -10.42, -1.01,
            8.42, 17.85, 27.27, 36.7]
    
    win_above = 0
    lose_above = 0
    lose_below = 0
    win_below = 0
    
    # df = df.sort_values([signal, returns])
     
    for bin in bins:
        k = bin
        for row, value in df.iterrows():
            if row[signal] < k:
                lose_below += row[returns]
            else:
                win_below -= row[returns]
        for row, value in df.iterrows():
            if row[signal] >= k:
                win_above += row[returns]
            else: 
                lose_above -= row[returns]
                
    print(win_above, lose_above, lose_below, win_below)
            
roc_table(df = df_train, row_count = df_train.shape[0],
          signal = 'predicted_RSI_indicator',
          returns = 'log_return')

But all I get is:

Traceback (most recent call last):

  File "<ipython-input-135-cd5513bb0778>", line 50, in <module>
    roc_table(df = df_train, row_count = df_train.shape[0],

  File "<ipython-input-135-cd5513bb0778>", line 32, in roc_table
    if row[signal] < k:

TypeError: 'Timestamp' object is not subscriptable

The index consists of datetime timestamps.

Here is a sample of the input df:

signal   returns
-.23      .045
2.3      -.09
9.8       1.2

The output would look like this:

bins      win_above   lose_above   win_below   lose_below
-48.13    123
-38.70    -98
-29.28    100
-19.85    -34 
-10.42     567
...

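So the idea is that if df[signal] is below the bin, the associated return is added to win_below if it is greater than 0, and to lose_below otherwise.

I will eventually add a loop for the signals that are above the bin, adding their returns to win_above and lose_above.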

Answer 1

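According to the pandas documentation, pandas.DataFrame.iterrows yields "the index of the row and the data of the row as a Series". In your loops, the first variable (row) is therefore bound to the index label, which is a Timestamp here, and that is why row[signal] raises the TypeError.

So you should write this (in both of your for loops):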

for i, row in df.iterrows():
    ...

instead of:

for row, value in df.iterrows():
    ...
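For illustration, here is a minimal sketch of how the corrected loop could build the per-bin table described in the question. It follows my reading of the question's description (positive returns count as wins, the rest as losses, split by whether the signal is below the bin), so treat the splitting rule as an assumption. The dates, the two-bin list and the column name `bin` are made up for the example; the three data rows are the sample from the question.

```python
import pandas as pd


def roc_table(df, signal, returns, bins):
    """Return one row per bin with returns summed into four buckets."""
    rows = []
    for k in bins:
        # Reset the accumulators for every bin so each row stands alone.
        win_above = lose_above = win_below = lose_below = 0.0
        # iterrows() yields (index, row): idx is the Timestamp, row is a Series.
        for idx, row in df.iterrows():
            r = row[returns]
            if row[signal] < k:
                if r > 0:
                    win_below += r
                else:
                    lose_below += r
            else:
                if r > 0:
                    win_above += r
                else:
                    lose_above += r
        rows.append({"bin": k,
                     "win_above": win_above, "lose_above": lose_above,
                     "win_below": win_below, "lose_below": lose_below})
    return pd.DataFrame(rows)


# Sample rows from the question; the dates are hypothetical.
df_train = pd.DataFrame(
    {"predicted_RSI_indicator": [-0.23, 2.3, 9.8],
     "log_return": [0.045, -0.09, 1.2]},
    index=pd.to_datetime(["2021-05-10", "2021-05-11", "2021-05-12"]),
)

table = roc_table(df_train, "predicted_RSI_indicator", "log_return",
                  bins=[-1.01, 8.42])
print(table)
```

Collecting one dict per bin and building the DataFrame at the end avoids keeping a single set of counters that accumulates across all bins, which is what the original function did.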
