Pandas - Finding first occurrence in a row based on column values

I have the following dataframe:

 Row    Bid_price   Bid_volume  Ask_price   Ask_volume
 2      2999.0      786.7      -500.0       1403.2
 3      3000.0      786.7      -499.9       1407.2
 4      2950.0      787.3      -250.1       1407.2
---------------------
 56     125.1       2691       36.9         3113.1
 57     125         2691.1     37           3133.1
---------------------
 117    41.4        3029.7     2999         3835.7
 118    40.05       3029.7     3000         3835.7
---------------------
 123    39.4        3129.7     NaN          NaN
 124    36.1        3129.7     NaN          NaN
 125    36          3134.7     NaN          NaN

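I need to take the first pair of Bid_price and Bid_volume (2999.0 and 786.7) and compare it with all pairs of Ask_price and Ask_volume. As long as Bid_volume < Ask_volume AND Bid_price > Ask_price, I jump to the next pair of Bid_price and Bid_volume and again compare it with all pairs of Ask_price and Ask_volume. Bid_price is decreasing, while Bid_volume, Ask_price and Ask_volume are increasing. Bid_price and Bid_volume have the same length, but Ask_price and Ask_volume are shorter.

The output should be the first instance where Bid_volume > Ask_volume AND Bid_price < Ask_price, i.e. where the condition is met. This is the case for the Bid_price and Bid_volume pair in row 124, which matches the Ask_price and Ask_volume pair in row 56.

The desired output should be: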

Row      Bid_price    Bid_volume  
124      36.1         3129.7

Row      Ask_price    Ask_volume
56       36.9         3113.1

My problem is that I can only evaluate the condition row by row. This returns nothing:

BidAsk = BidAsk[(BidAsk["Bid_volume"] > BidAsk["Ask_volume"]) & (BidAsk["Bid_price"] < BidAsk["Ask_price"])]
BidAsk[["Bid_price","Bid_volume"]]

And this gives a traceback error:

BidAsk = BidAsk.where((BidAsk["Bid_volume"] > BidAsk["Ask_volume"]) & (BidAsk["Bid_Price"] < BidAsk["Ask_Price"]))
BidAsk[["Bid_price", "Bid_volume"]]

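Any help is greatly appreciated. Thanks!

The first approach returns nothing because the two conditions are never both true on the same row; you have to reverse the comparison signs.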

BidAsk = BidAsk[(BidAsk["Bid_volume"] < BidAsk["Ask_volume"]) & (BidAsk["Bid_price"] > BidAsk["Ask_price"])]
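To see why the original mask is empty, here is a minimal sketch using four rows taken from the question's dataframe (the subset and the variable names `row_wise` and `cross` are only illustrative). A boolean mask built from two columns compares values on the same row only, whereas the match the question is after pairs a bid on one row with an ask on a different row:

```python
import numpy as np
import pandas as pd

# Illustrative subset of the question's data (rows 2, 56, 117 and 124)
BidAsk = pd.DataFrame({
    "Bid_price":  [2999.0, 125.1, 41.4, 36.1],
    "Bid_volume": [786.7, 2691.0, 3029.7, 3129.7],
    "Ask_price":  [-500.0, 36.9, 2999.0, np.nan],
    "Ask_volume": [1403.2, 3113.1, 3835.7, np.nan],
})

# Element-wise: the bid on row i is only compared with the ask on row i
row_wise = (BidAsk["Bid_volume"] > BidAsk["Ask_volume"]) & (BidAsk["Bid_price"] < BidAsk["Ask_price"])
print(row_wise.tolist())  # no row satisfies both conditions

# Cross-row: the last bid (36.1, 3129.7) against the second ask (36.9, 3113.1)
bid = BidAsk.iloc[3]
ask = BidAsk.iloc[1]
cross = bool(bid["Bid_volume"] > ask["Ask_volume"] and bid["Bid_price"] < ask["Ask_price"])
print(cross)
```

Reversing the signs makes the row-wise filter return rows, but the comparison is still made within each row.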

I hope I've understood correctly: this script finds the first Bid_price and Bid_volume pair that satisfies the condition Bid_volume > Ask_volume AND Bid_price < Ask_price.

If I have this dataframe:

   Bid_price  Bid_volume  Ask_price  Ask_volume
0     2999.0       786.7     -500.0      1403.2
1     3000.0       786.7     -499.9      1407.2
2     2950.0       787.3     -250.1      1407.2
3     2500.0       792.8     -250.0      1593.2
4     2000.0       798.9     -200.1      1593.2
5     1400.0      2000.0     1200.0      1600.0
6       36.1      3129.7        NaN         NaN

Then:

import pandas as pd
from io import StringIO

txt = '''Bid_price   Bid_volume  Ask_price   Ask_volume
2999.0      786.7      -500.0       1403.2
3000.0      786.7      -499.9       1407.2
2950.0      787.3      -250.1       1407.2
2500.0      792.8      -250.0       1593.2
2000.0      798.9      -200.1       1593.2
1400.0     2000.0      1200.0       1600.0
  36.1     3129.7             '''

df = pd.read_fwf(StringIO(txt))

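# Largest ask price and ask volume (max() skips NaN by default)
max_price = df.Ask_price.max()
max_volume = df.Ask_volume.max()

# Bids priced below the highest ask price and with volume above the highest ask volume
mask = (df.Bid_price < max_price) & (df.Bid_volume > max_volume)

# First bid that satisfies both conditions
print(df.loc[mask, ['Bid_price', 'Bid_volume']].head(1))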

Prints:

   Bid_price  Bid_volume
6       36.1      3129.7
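One caveat with the max-based mask: a bid passes only if its volume exceeds the largest ask volume (so every ask volume), while its price only needs to be below the largest ask price. The sketch below applies the same mask to the sample rows from the question, loaded here with `pd.read_csv` and a whitespace separator purely for convenience. On that data no bid volume exceeds the largest ask volume, so the result is empty, which is presumably why the pairwise search in the edit below is needed:

```python
import pandas as pd
from io import StringIO

# Sample rows as shown in the question
txt = '''Row Bid_price Bid_volume Ask_price Ask_volume
2 2999.0 786.7 -500.0 1403.2
3 3000.0 786.7 -499.9 1407.2
4 2950.0 787.3 -250.1 1407.2
56 125.1 2691 36.9 3113.1
57 125 2691.1 37 3133.1
117 41.4 3029.7 2999 3835.7
118 40.05 3029.7 3000 3835.7
123 39.4 3129.7 NaN NaN
124 36.1 3129.7 NaN NaN
125 36 3134.7 NaN NaN
'''
df = pd.read_csv(StringIO(txt), sep=r"\s+")

max_price = df["Ask_price"].max()    # NaN values are skipped
max_volume = df["Ask_volume"].max()

# Same mask as above: compares every bid with the ask maxima only
mask = (df["Bid_price"] < max_price) & (df["Bid_volume"] > max_volume)

print(max_price, max_volume)
print(df.loc[mask, ["Row", "Bid_price", "Bid_volume"]])  # empty for this data
```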

EDIT (based on the updated question):

import pandas as pd
from io import StringIO

txt = ''' Row    Bid_price   Bid_volume  Ask_price   Ask_volume
 2      2999.0      786.7      -500.0       1403.2
 3      3000.0      786.7      -499.9       1407.2
 4      2950.0      787.3      -250.1       1407.2
 56     125.1       2691       36.9         3113.1
 57     125         2691.1     37           3133.1
 117    41.4        3029.7     2999         3835.7
 118    40.05       3029.7     3000         3835.7
 123    39.4        3129.7     NaN          NaN
 124    36.1        3129.7     NaN          NaN
 125    36          3134.7     NaN          NaN'''

df = pd.read_fwf(StringIO(txt))

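def get_indexes(df):
    """Return the first (bid, ask) match as a tuple, or None if there is none."""
    # Outer loop: bids in order; inner loop: every ask for the current bid
    for idx1, bid_price, bid_volume in zip(df.index, df.Bid_price, df.Bid_volume):
        for idx2, ask_price, ask_volume in zip(df.index, df.Ask_price, df.Ask_volume):
            # Comparisons with NaN are False, so rows without an ask never match
            if bid_volume > ask_volume and bid_price < ask_price:
                return idx1, idx2, bid_price, bid_volume, ask_price, ask_volume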

print(df)
print()

result = get_indexes(df)
if result:
    print('Bid Price   =', result[2])
    print('Bid Volume  =', result[3])
    print('Ask Price   =', result[4])
    print('Ask Volume  =', result[5])
    print('Index bid   =', result[0])
    print('Index ask   =', result[1])

Prints:

   Row  Bid_price  Bid_volume  Ask_price  Ask_volume
0    2    2999.00       786.7     -500.0      1403.2
1    3    3000.00       786.7     -499.9      1407.2
2    4    2950.00       787.3     -250.1      1407.2
3   56     125.10      2691.0       36.9      3113.1
4   57     125.00      2691.1       37.0      3133.1
5  117      41.40      3029.7     2999.0      3835.7
6  118      40.05      3029.7     3000.0      3835.7
7  123      39.40      3129.7        NaN         NaN
8  124      36.10      3129.7        NaN         NaN
9  125      36.00      3134.7        NaN         NaN

Bid Price   = 36.1
Bid Volume  = 3129.7
Ask Price   = 36.9
Ask Volume  = 3113.1
Index bid   = 8
Index ask   = 3
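For larger frames, the nested loops could be replaced with NumPy broadcasting. The following is only a sketch of that alternative, not part of the answer above: the helper name `first_match` is hypothetical, and it builds a full bids-by-asks boolean matrix, so it assumes the frame is small enough for that matrix to fit in memory.

```python
import numpy as np
import pandas as pd

# Same rows as in the updated question
df = pd.DataFrame({
    "Row":        [2, 3, 4, 56, 57, 117, 118, 123, 124, 125],
    "Bid_price":  [2999.0, 3000.0, 2950.0, 125.1, 125.0, 41.4, 40.05, 39.4, 36.1, 36.0],
    "Bid_volume": [786.7, 786.7, 787.3, 2691.0, 2691.1, 3029.7, 3029.7, 3129.7, 3129.7, 3134.7],
    "Ask_price":  [-500.0, -499.9, -250.1, 36.9, 37.0, 2999.0, 3000.0, np.nan, np.nan, np.nan],
    "Ask_volume": [1403.2, 1407.2, 1407.2, 3113.1, 3133.1, 3835.7, 3835.7, np.nan, np.nan, np.nan],
})

def first_match(df):
    """Hypothetical helper: positions (bid, ask) of the first match, or None."""
    # Bids as a column vector, asks as a row vector, so comparisons broadcast
    bid_p = df["Bid_price"].to_numpy()[:, None]
    bid_v = df["Bid_volume"].to_numpy()[:, None]
    ask_p = df["Ask_price"].to_numpy()[None, :]
    ask_v = df["Ask_volume"].to_numpy()[None, :]

    # match[i, j] is True when bid i beats ask j; NaN asks compare as False
    match = (bid_v > ask_v) & (bid_p < ask_p)
    if not match.any():
        return None
    # argmax scans in row-major order: bids outer, asks inner, like the loops
    i, j = np.unravel_index(match.argmax(), match.shape)
    return int(i), int(j)

result = first_match(df)
if result is not None:
    i, j = result
    print(df.iloc[[i]][["Row", "Bid_price", "Bid_volume"]])
    print(df.iloc[[j]][["Row", "Ask_price", "Ask_volume"]])
```

On the sample data this picks the same pair as the loop version (bid row 124, ask row 56).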

