
Python - faster way to run a for loop in a dataframe

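I am running the following code to calculate, for each dataframe row, the number of positive days in the preceding rows and the number of days on which the stock beat the S&P 500 index: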

        for offset in [1,5,15,30,45,60,75,90,120,150,
                       200,250,500,750,1000,1250,1500]:
            asset['return_stock'] = (asset.Close - asset.Close.shift(1)) / (asset.Close.shift(1))
            merged_data = pd.merge(asset, sp_500, on='Date')
            total_positive_days=0
            total_beating_sp_days=0
            for index, row in merged_data.iterrows():
                print(offset, index)
                for i in range(0,offset):
                    if index-i-1>0:
                        if merged_data.loc[index-i,'Close_x'] > merged_data.loc[index-i-1,'Close_x']:
                            total_positive_days+=1
                        if merged_data.loc[index-i,'return_stock'] > merged_data.loc[index-i-1,'return_sp']:
                            total_beating_sp_days+=1

But it is very slow. Is there a way to speed it up (perhaps by somehow getting rid of the for loops)?

My dataset looks like this (merged_data is as follows):

Date     Open_x     High_x      Low_x    Close_x  Adj Close_x   Volume_x  return_stock  Pct_positive_1  Pct_beating_1  Pct_change_1  Pct_change_plus_1  Pct_positive_5  Pct_beating_5  Pct_change_5  Pct_change_plus_5  Pct_positive_15  Pct_beating_15  Pct_change_15  Pct_change_plus_15  Pct_positive_30  Pct_beating_30  Pct_change_30  Pct_change_plus_30       Open_y       High_y        Low_y      Close_y  Adj Close_y    Volume_y  return_sp
0  2010-01-04  30.490000  30.642857  30.340000  30.572857    26.601469  123432400           NaN          1311.0         1261.0           NaN          -0.001726          1310.4         1260.8           NaN           0.018562           1307.2          1257.6            NaN            0.039186      1302.066667     1252.633333            NaN            0.056579  1116.560059  1133.869995  1116.560059  1132.989990  1132.989990  3991400000   0.016043
1  2010-01-05  30.657143  30.798571  30.464285  30.625713    26.647457  150476200      0.001729          1311.0         1261.0      0.001729           0.016163          1310.4         1260.8           NaN           0.032062           1307.2          1257.6            NaN            0.031268      1302.066667     1252.633333            NaN            0.056423  1132.660034  1136.630005  1129.660034  1136.520020  1136.520020  2491020000   0.003116
2  2010-01-06  30.625713  30.747143  30.107143  30.138571    26.223597  138040000     -0.015906          1311.0         1261.0     -0.015906           0.001852          1310.4         1260.8           NaN           0.001519           1307.2          1257.6            NaN            0.058608      1302.066667     1252.633333            NaN            0.046115  1135.709961  1139.189941  1133.949951  1137.140015  1137.140015  4972660000   0.000546
3  2010-01-07  30.250000  30.285715  29.864286  30.082857    26.175119  119282800     -0.001849          1311.0         1261.0     -0.001849          -0.006604          1310.4         1260.8           NaN           0.005491           1307.2          1257.6            NaN            0.096428      1302.066667     1252.633333            NaN            0.050694  1136.270020  1142.459961  1131.319946  1141.689941  1141.689941  5270680000   0.004001
4  2010-01-08  30.042856  30.285715  29.865715  30.282858    26.349140  111902700      0.006648          1311.0         1261.0      0.006648           0.008900          1310.4         1260.8           NaN           0.029379           1307.2          1257.6            NaN            0.088584      1302.066667     1252.633333            NaN            0.075713  1140.520020  1145.390015  1136.219971  1144.979980  1144.979980  4389590000   0.002882

asset is as follows:

         Date       Open       High        Low      Close  Adj Close     Volume  return_stock  Pct_positive_1  Pct_beating_1  Pct_change_1  Pct_change_plus_1  Pct_positive_5  Pct_beating_5  Pct_change_5  Pct_change_plus_5
0  2010-01-04  30.490000  30.642857  30.340000  30.572857  26.601469  123432400           NaN          1311.0         1261.0           NaN          -0.001726          1310.4         1260.8           NaN           0.018562
1  2010-01-05  30.657143  30.798571  30.464285  30.625713  26.647457  150476200      0.001729          1311.0         1261.0      0.001729           0.016163          1310.4         1260.8           NaN           0.032062
2  2010-01-06  30.625713  30.747143  30.107143  30.138571  26.223597  138040000     -0.015906          1311.0         1261.0     -0.015906           0.001852          1310.4         1260.8           NaN           0.001519
3  2010-01-07  30.250000  30.285715  29.864286  30.082857  26.175119  119282800     -0.001849          1311.0         1261.0     -0.001849          -0.006604          1310.4         1260.8           NaN           0.005491
4  2010-01-08  30.042856  30.285715  29.865715  30.282858  26.349140  111902700      0.006648          1311.0         1261.0      0.006648           0.008900          1310.4         1260.8           NaN           0.029379

sp_500 is as follows:

         Date         Open         High          Low        Close    Adj Close      Volume  return_sp
0  1999-12-31  1464.469971  1472.420044  1458.189941  1469.250000  1469.250000   374050000        NaN
1  2000-01-03  1469.250000  1478.000000  1438.359985  1455.219971  1455.219971   931800000  -0.009549
2  2000-01-04  1455.219971  1455.219971  1397.430054  1399.420044  1399.420044  1009000000  -0.038345
3  2000-01-05  1399.420044  1413.270020  1377.680054  1402.109985  1402.109985  1085500000   0.001922
4  2000-01-06  1402.109985  1411.900024  1392.099976  1403.449951  1403.449951  1092300000   0.000956

This is a partial answer.

I think the way you compute

asset.Close - asset.Close.shift(1)

at the top is the key to how you can do this. Instead of

if merged_data.loc[index-i,'Close_x'] > merged_data.loc[index-i-1,'Close_x']

create a column holding the change in Close_x:

merged_data['Delta_Close_x'] = merged_data.Close_x - merged_data.Close_x.shift(1)

Similarly,

if merged_data.loc[index-i,'return_stock'] > merged_data.loc[index-i-1,'return_sp']

becomes

merged_data['vs_sp'] = merged_data.return_stock - merged_data.return_sp.shift(1)
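
As a minimal sketch of these two derived columns, the snippet below builds a toy merged_data from values rounded from the first five rows shown in the question (the toy frame itself is an assumption, not the real data). The first row of each new column is NaN because shift(1) has no earlier row to look at.

```python
import pandas as pd

# Toy stand-in for merged_data; values are rounded from the question's first five rows.
merged_data = pd.DataFrame({
    "Close_x": [30.57, 30.63, 30.14, 30.08, 30.28],
    "return_stock": [float("nan"), 0.0017, -0.0159, -0.0018, 0.0066],
    "return_sp": [0.0160, 0.0031, 0.0005, 0.0040, 0.0029],
})

# Change in the close versus the previous row.
merged_data["Delta_Close_x"] = merged_data["Close_x"] - merged_data["Close_x"].shift(1)

# Stock return minus the previous row's S&P 500 return, mirroring the original comparison.
merged_data["vs_sp"] = merged_data["return_stock"] - merged_data["return_sp"].shift(1)

print(merged_data[["Delta_Close_x", "vs_sp"]])
```

A positive Delta_Close_x marks an up day, and a positive vs_sp marks a day on which the stock's return exceeded the previous row's index return.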

Then you can iterate over i and use a subset like this:

merged_data[(merged_data['Delta_Close_x'] > 0) & (merged_data['vs_sp'] > 0)]
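
One detail worth spelling out when combining two conditions on a dataframe: pandas needs the element-wise operator & with each comparison in parentheses, because Python's `and` tries to reduce a whole Series to a single True/False and raises a ValueError. The sketch below uses made-up values for the two derived columns (an assumption for illustration only) and also shows how a count over a window of rows can be taken from the mask.

```python
import pandas as pd

# Toy stand-in for merged_data with the two derived columns already present.
merged_data = pd.DataFrame({
    "Delta_Close_x": [float("nan"), 1.0, -0.5, 1.5, 0.5],
    "vs_sp": [float("nan"), 0.09, -0.065, 0.193, -0.01],
})

# Element-wise AND of the two conditions; NaN comparisons evaluate to False.
both = (merged_data["Delta_Close_x"] > 0) & (merged_data["vs_sp"] > 0)

# Rows where the close rose and the stock beat the index.
subset = merged_data[both]

# Count of such rows within the last three rows: slice the mask, then sum the True values.
count_last_3 = int(both.iloc[-3:].sum())

print(subset)
print(count_last_3)
```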

There are still a lot of details to work out, but I hope this gets you started.
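
One way to fill in some of those details, offered as a sketch rather than a drop-in replacement: turn each condition into a boolean column and take a rolling sum, so every row gets its count over the trailing window without any Python-level row loop. The helper name trailing_counts, the output column names, and the toy data are hypothetical. It assumes the goal is a per-row count over the current row and the offset - 1 rows before it. The original loop differs in two ways: its running totals are never reset between rows, and its guard index-i-1>0 skips the comparison between rows 1 and 0.

```python
import pandas as pd

def trailing_counts(merged_data, offsets):
    """Per-row counts over the trailing `offset` rows (current row included)."""
    # True where the close rose versus the previous row; the first row compares as False.
    up = merged_data["Close_x"].diff() > 0
    # True where the stock's return exceeds the previous row's S&P 500 return.
    beat = merged_data["return_stock"] > merged_data["return_sp"].shift(1)

    out = pd.DataFrame(index=merged_data.index)
    for offset in offsets:
        # min_periods=1 lets the early rows use however many rows exist so far.
        out[f"positive_days_{offset}"] = up.astype(int).rolling(offset, min_periods=1).sum()
        out[f"beating_sp_days_{offset}"] = beat.astype(int).rolling(offset, min_periods=1).sum()
    return out

# Made-up data, only to exercise the function.
toy = pd.DataFrame({
    "Close_x": [10.0, 11.0, 10.5, 12.0, 12.5],
    "return_stock": [float("nan"), 0.10, -0.045, 0.143, 0.042],
    "return_sp": [0.01, 0.02, -0.05, 0.01, 0.03],
})
counts = trailing_counts(toy, [1, 3])
print(counts)
```

The loop over offsets remains, but it runs once per window size rather than once per row, and each iteration is a single vectorized pass over the column.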

