How to iterate through two columns in a pandas dataframe to add the values to a list
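I am trying to evaluate a condition on one pandas column and, depending on the result, take the value from another pandas column and append it to a list.
I tried the following: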
def roc_table(df, row_count, signal, returns):
    """
    Parameters
    ----------
    df : dataframe
    row_count : length of data
    signal : signal/s
    returns : log returns

    Returns
    -------
    table - hopefully
    """
    df = df.copy()
    bins = [-48.13, -38.70, -29.28, -19.85, -10.42, -1.01,
            8.42, 17.85, 27.27, 36.7]
    win_above = 0
    lose_above = 0
    lose_below = 0
    win_below = 0
    # df = df.sort_values([signal, returns])
    for bin in bins:
        k = bin
        for row, value in df.iterrows():
            if row[signal] < k:
                lose_below += row[returns]
            else:
                win_below -= row[returns]
        for row, value in df.iterrows():
            if row[signal] >= k:
                win_above += row[returns]
            else:
                lose_above -= row[returns]
    print(win_above, lose_above, lose_below, win_below)

roc_table(df = df_train, row_count = df_train.shape[0],
          signal = 'predicted_RSI_indicator',
          returns = 'log_return')
But all I get is:
Traceback (most recent call last):
  File "<ipython-input-135-cd5513bb0778>", line 50, in <module>
    roc_table(df = df_train, row_count = df_train.shape[0],
  File "<ipython-input-135-cd5513bb0778>", line 32, in roc_table
    if row[signal] < k:
TypeError: 'Timestamp' object is not subscriptable
The index is a datetime timestamp.
Here is a sample of the input df:
signal returns
-.23 .045
2.3 -.09
9.8 1.2
The output would look something like this:
bins win_above lose_above win_below lose_below
-48.13 123
-38.70 -98
-29.28 100
-19.85 -34
-10.42 567
...
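So the idea is that if df[signal] is below the bin, the associated return is added to win_below if it is greater than 0, and to lose_below otherwise.
Eventually I will add a loop for the signals that are greater than the bin and add their returns to win_above and lose_above.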
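According to the pandas documentation, pandas.DataFrame.iterrows yields, for each row, the index of the row followed by the data of the row as a Series. In your loop the first variable, row, is therefore the index label (a Timestamp here), which is why subscripting it fails.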
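To see that ordering for yourself, here is a minimal sketch. The frame, its column names and its dates are made up for illustration and only mirror the sample in the question:

```python
import pandas as pd

# Hypothetical sample data with a datetime index, mirroring the question's setup.
df = pd.DataFrame(
    {"signal": [-0.23, 2.3, 9.8], "returns": [0.045, -0.09, 1.2]},
    index=pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-03"]),
)

# iterrows() yields (index, Series) pairs: the index label comes first.
first, second = next(df.iterrows())
print(type(first).__name__)   # Timestamp
print(type(second).__name__)  # Series

# Only the second item can be subscripted by column name.
print(second["signal"])
```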
So you should write this (in both of your for loops):
for i, row in df.iterrows():
    ...
instead of:
for row, value in df.iterrows():
    ...
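Putting it together, one way the whole function could look is sketched below. It is not a drop-in replacement for your code: it follows the rule you describe in prose (positive returns count as wins, everything else as losses), resets the totals for every bin so each bin gets its own row, and takes bins as a parameter. Those choices, the sample frame and the two thresholds in the usage lines are my assumptions.

```python
import pandas as pd

def roc_table(df, signal, returns, bins):
    """Return one row of win/lose return totals per bin threshold."""
    rows = []
    for k in bins:
        # Assumption: totals restart at zero for every threshold.
        totals = {"bins": k, "win_above": 0.0, "lose_above": 0.0,
                  "win_below": 0.0, "lose_below": 0.0}
        # iterrows() yields (index, Series); subscript the Series, not the index.
        for i, row in df.iterrows():
            side = "below" if row[signal] < k else "above"
            # Assumption: a return of exactly 0 is counted as a loss.
            outcome = "win" if row[returns] > 0 else "lose"
            totals[f"{outcome}_{side}"] += row[returns]
        rows.append(totals)
    return pd.DataFrame(rows)

# Hypothetical data shaped like the sample in the question.
df = pd.DataFrame(
    {"signal": [-0.23, 2.3, 9.8], "returns": [0.045, -0.09, 1.2]},
    index=pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-03"]),
)
table = roc_table(df, "signal", "returns", bins=[-1.01, 8.42])
print(table)
```

Naming the loop variable k rather than bin also avoids shadowing Python's built-in bin function. For large frames, iterrows is slow, so a boolean-mask version would probably be worth considering once the logic is right.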