[英]Pandas- Finding first occurence in a row based on column values
我有以下 dataframe:
Row Bid_price Bid_volume Ask_price Ask_volume
2 2999.0 786.7 -500.0 1403.2
3 3000.0 786.7 -499.9 1407.2
4 2950.0 787.3 -250.1 1407.2
---------------------
56 125.1 2691 36.9 3113.1
57 125 2691.1 37 3133.1
---------------------
117 41.4 3029.7 2999 3835.7
118 40.05 3029.7 3000 3835.7
---------------------
123 39.4 3129.7 NaN NaN
124 36.1 3129.7 NaN NaN
125 36 3134.7 NaN NaN
我需要取第一对Bid_price and Bid_volume (2999.0 and 786.7)
并与所有对Ask_price and Ask_volume
进行比较。 只要Bid_volume < Ask_volume AND Bid_price > Ask_price
我跳转到下一对Bid_price and Bid_volume
并再次与所有对Ask_price and Ask_volume
进行比较。 Bid_Price
减少, Bid_Volume
增加, Ask_Price
增加, Ask_Volume
增加。 Bid_Price, Bid_Volume
的长度相同,但Ask_Price
和Ask_Volume
较短。
output 应该是Bid_volume is > Ask_volume AND Bid_price < Ask_price
的第一个实例,因此满足条件。 这是Bid_Price and Bid_Volume
对的第 124 行的情况,与 Ask_Price 和Ask_Price and Ask_Volume
对的第 56 行匹配。
所需的 output 应该是:
Row Bid_price Bid_volume
124 36.1 3129.7
Row Ask_price Ask_volume
56 36.9 3113.1
我的问题是我只能评估每一行的条件。 这不返回任何内容:
BidAsk = BidAsk[(BidAsk["Bid_volume"] > BidAsk["Ask_volume"]) & (BidAsk["Bid_price"] < BidAsk["Ask_price"])]
BidAsk[["Bid_price","Bid_volume"]]
这在这里给出了一个回溯错误:
BidAsk = BidAsk.where((BidAsk["Bid_volume"] > BidAsk["Ask_volume"]) & (BidAsk["Bid_Price"] < BidAsk["Ask_Price"]))
BidAsk[["Bid_price", "Bid_volume"]]
非常感谢任何帮助。 谢谢!
第一种方法不返回任何内容,因为这两个条件始终为假,您必须反转比较符号。
BidAsk = BidAsk[(BidAsk["Bid_volume"] < BidAsk["Ask_volume"]) & (BidAsk["Bid_price"] > BidAsk["Ask_price"])]
希望我理解正确,此脚本将找到满足条件Bid_volume is > Ask_volume AND Bid_price < Ask_price
Bid_price
Bid_volume
如果我有这个 dataframe:
Bid_price Bid_volume Ask_price Ask_volume
0 2999.0 786.7 -500.0 1403.2
1 3000.0 786.7 -499.9 1407.2
2 2950.0 787.3 -250.1 1407.2
3 2500.0 792.8 -250.0 1593.2
4 2000.0 798.9 -200.1 1593.2
5 1400.0 2000.0 1200.0 1600.0
6 36.1 3129.7 NaN NaN
然后:
import pandas as pd
from io import StringIO
txt = '''Bid_price Bid_volume Ask_price Ask_volume
2999.0 786.7 -500.0 1403.2
3000.0 786.7 -499.9 1407.2
2950.0 787.3 -250.1 1407.2
2500.0 792.8 -250.0 1593.2
2000.0 798.9 -200.1 1593.2
1400.0 2000.0 1200.0 1600.0
36.1 3129.7 '''
df = pd.read_fwf(StringIO(txt))
max_price = df.Ask_price.max()
max_volume = df.Ask_volume.max()
mask = pd.concat([df.Bid_price < max_price, df.Bid_volume > max_volume], axis=1).all(axis=1)
print( df.loc[mask, ['Bid_price', 'Bid_volume']].head(1) )
印刷:
Bid_price Bid_volume
6 36.1 3129.7
编辑(根据更新的问题):
import pandas as pd
from io import StringIO
txt = ''' Row Bid_price Bid_volume Ask_price Ask_volume
2 2999.0 786.7 -500.0 1403.2
3 3000.0 786.7 -499.9 1407.2
4 2950.0 787.3 -250.1 1407.2
56 125.1 2691 36.9 3113.1
57 125 2691.1 37 3133.1
117 41.4 3029.7 2999 3835.7
118 40.05 3029.7 3000 3835.7
123 39.4 3129.7 NaN NaN
124 36.1 3129.7 NaN NaN
125 36 3134.7 NaN NaN'''
df = pd.read_fwf(StringIO(txt))
def get_indexes(df):
for idx1, bid_price, bid_volume in zip(df.index, df.Bid_price, df.Bid_volume):
for idx2, ask_price, ask_volume in zip(df.index, df.Ask_price, df.Ask_volume):
if bid_volume > ask_volume and bid_price < ask_price:
return idx1, idx2, bid_price, bid_volume, ask_price, ask_volume
print(df)
print()
result = get_indexes(df)
if result:
print('Bid Price =', result[2])
print('Bid Volume =', result[3])
print('Ask Price =', result[4])
print('Ask Volume =', result[5])
print('Index bid =', result[0])
print('Index ask =', result[1])
印刷:
Row Bid_price Bid_volume Ask_price Ask_volume
0 2 2999.00 786.7 -500.0 1403.2
1 3 3000.00 786.7 -499.9 1407.2
2 4 2950.00 787.3 -250.1 1407.2
3 56 125.10 2691.0 36.9 3113.1
4 57 125.00 2691.1 37.0 3133.1
5 117 41.40 3029.7 2999.0 3835.7
6 118 40.05 3029.7 3000.0 3835.7
7 123 39.40 3129.7 NaN NaN
8 124 36.10 3129.7 NaN NaN
9 125 36.00 3134.7 NaN NaN
Bid Price = 36.1
Bid Volume = 3129.7
Ask Price = 36.9
Ask Volume = 3113.1
Index bid = 8
Index ask = 3
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.