How to iterate over a Pandas DataFrame and update rows based on previous rows

Question

I have some code that works, but it is rather slow. I need to update a table of trades and quotes. The base table looks like this:

+--------+-----------+----------+----------+--------+----------+
| Symbol | Timestamp | BidPrice | AskPrice | Price  | Quantity |
+--------+-----------+----------+----------+--------+----------+
| MSFT   | 9:00      |          |          | 46.98  |      140 |
| MSFT   | 9:01      |          |          | 46.99  |      100 |
| MSFT   | 9:02      |          |          | 47     |      400 |
| MSFT   | 9:03      |          |          | 47     |      100 |
| MSFT   | 9:04      | 46.87    | 46.99    |        |          |
| MSFT   | 9:05      |          |          | 46.89  |      100 |
| MSFT   | 9:06      |          |          | 46.95  |      600 |
| MSFT   | 9:07      | 46.91    | 46.99    |        |          |
| MSFT   | 9:08      | 46.91    | 46.97    |        |          |
| MSFT   | 9:09      |          |          | 46.935 |      100 |
| MSFT   | 9:10      | 46.89    | 46.96    |        |          |
| MSFT   | 9:11      |          |          | 46.93  |      100 |
| MSFT   | 9:12      |          |          | 46.91  |      100 |
+--------+-----------+----------+----------+--------+----------+
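
For anyone who wants to reproduce the problem, the base table can be rebuilt as a DataFrame. This is only a sketch: the empty cells are assumed to be NaN, and the name qts_trd is taken from the loop in the question.

```python
import numpy as np
import pandas as pd

nan = np.nan  # empty cells in the table are assumed to be NaN

# Rows copied from the base table: (Symbol, Timestamp, BidPrice, AskPrice, Price, Quantity)
rows = [
    ("MSFT", "9:00", nan, nan, 46.98, 140),
    ("MSFT", "9:01", nan, nan, 46.99, 100),
    ("MSFT", "9:02", nan, nan, 47.0, 400),
    ("MSFT", "9:03", nan, nan, 47.0, 100),
    ("MSFT", "9:04", 46.87, 46.99, nan, nan),
    ("MSFT", "9:05", nan, nan, 46.89, 100),
    ("MSFT", "9:06", nan, nan, 46.95, 600),
    ("MSFT", "9:07", 46.91, 46.99, nan, nan),
    ("MSFT", "9:08", 46.91, 46.97, nan, nan),
    ("MSFT", "9:09", nan, nan, 46.935, 100),
    ("MSFT", "9:10", 46.89, 46.96, nan, nan),
    ("MSFT", "9:11", nan, nan, 46.93, 100),
    ("MSFT", "9:12", nan, nan, 46.91, 100),
]
columns = ["Symbol", "Timestamp", "BidPrice", "AskPrice", "Price", "Quantity"]
qts_trd = pd.DataFrame(rows, columns=columns)

print(qts_trd)
```

Trade rows have a Price but no bid/ask, and quote rows have a bid/ask but no Price, which is what the loop in the question relies on.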

I need to set the bid and ask for each trade (trade rows have a Price but no bid/ask). So, starting with bid = 46.8 and ask = 47, fill in those values, and whenever a quote row changes them, carry the new values forward. Like this:

+--------+-----------+----------+----------+--------+----------+
| Symbol | Timestamp | BidPrice | AskPrice | Price  | Quantity |
+--------+-----------+----------+----------+--------+----------+
| MSFT   | 9:00      | 46.8     | 47       | 46.98  |      140 |
| MSFT   | 9:01      | 46.8     | 47       | 46.99  |      100 |
| MSFT   | 9:02      | 46.8     | 47       | 47     |      400 |
| MSFT   | 9:03      | 46.8     | 47       | 47     |      100 |
| MSFT   | 9:04      | 46.87    | 46.99    |        |          |
| MSFT   | 9:05      | 46.87    | 46.99    | 46.89  |      100 |
| MSFT   | 9:06      | 46.87    | 46.99    | 46.95  |      600 |
| MSFT   | 9:07      | 46.91    | 46.99    |        |          |
| MSFT   | 9:08      | 46.91    | 46.97    |        |          |
| MSFT   | 9:09      | 46.91    | 46.97    | 46.935 |      100 |
| MSFT   | 9:10      | 46.89    | 46.96    |        |          |
| MSFT   | 9:11      | 46.89    | 46.96    | 46.93  |      100 |
| MSFT   | 9:12      | 46.89    | 46.96    | 46.91  |      100 |
+--------+-----------+----------+----------+--------+----------+

I worked this out by iterating over the rows, but for 112k rows it takes 35 seconds.

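import numpy as np

# Starting quote, used until the first quote row appears
bid, ask = 46.8, 47.0
for i, row in qts_trd.iterrows():
    if np.isnan(row['Price']):        # quote row: remember the latest bid/ask
        bid = row['BidPrice']
        ask = row['AskPrice']
    if np.isnan(row['BidPrice']):     # trade row: stamp it with the latest quote
        qts_trd.at[i, 'BidPrice'] = bid
        qts_trd.at[i, 'AskPrice'] = ask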

I know the basics of lambda functions and applying the same one to every row, which I think would be quicker. But as you can see, the value to fill in changes as you move down the table. Is there a more efficient way to do this?

This is Python 3.7 in Spyder.

Answer 1

Try the pandas fillna() function with method='ffill', which carries the last valid value forward into the missing cells.

So:

qts_trd.BidPrice.fillna(method='ffill', inplace=True)
qts_trd.AskPrice.fillna(method='ffill', inplace=True)

In my experience it's very quick.
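
A small sketch of what forward fill does, using a shortened bid column. The sample values are picked from the question's table for illustration. Series.ffill() is used here as the equivalent of fillna(method='ffill'), because recent pandas releases deprecate the method= argument.

```python
import numpy as np
import pandas as pd

# Shortened bid column: trades have no bid (NaN), quotes do.
bid = pd.Series([np.nan, np.nan, 46.87, np.nan, np.nan, 46.91, 46.91, np.nan])

# Each NaN takes the most recent non-missing value above it.
filled = bid.ffill()

print(filled.tolist())
```

Note that the NaN values before the first quote have nothing above them to copy, so they stay NaN and need a starting value from somewhere else.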

Edit:

I just realised this won't fill your first values, because there is nothing above them to fill from. The code below inserts a row at the top holding the starting bid and ask, forward fills, and then deletes that row.

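# Temporary first row holding the starting bid/ask (other columns left blank)
qts_trd.loc[-1] = ['', '', 46.8, 47, '', '']
qts_trd.index += 1                     # shift labels so the new row becomes 0
qts_trd.sort_index(inplace=True)
qts_trd.BidPrice.fillna(method='ffill', inplace=True)
qts_trd.AskPrice.fillna(method='ffill', inplace=True)
qts_trd.drop(index=0, inplace=True)    # remove the temporary row
qts_trd.reset_index(drop=True, inplace=True)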

Edit 2.0, thanks to @no_body's comment:

# inplace=True returns None, so chain on the returned Series and assign back
qts_trd['BidPrice'] = qts_trd['BidPrice'].fillna(method='ffill').fillna(46.8)
qts_trd['AskPrice'] = qts_trd['AskPrice'].fillna(method='ffill').fillna(47)
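
Putting the forward fill and the starting values together, here is a minimal runnable sketch on the first few rows of the question's table. The helper name fill_quotes and its parameters are made up for illustration. It uses ffill() and assigns the result back to the column rather than calling an inplace method on it, since newer pandas versions warn about that pattern.

```python
import numpy as np
import pandas as pd

nan = np.nan
# First seven rows of the question's table (Symbol and Quantity omitted for brevity).
qts_trd = pd.DataFrame({
    "Timestamp": ["9:00", "9:01", "9:02", "9:03", "9:04", "9:05", "9:06"],
    "BidPrice": [nan, nan, nan, nan, 46.87, nan, nan],
    "AskPrice": [nan, nan, nan, nan, 46.99, nan, nan],
    "Price": [46.98, 46.99, 47.0, 47.0, nan, 46.89, 46.95],
})


def fill_quotes(df, start_bid, start_ask):
    """Hypothetical helper: return a copy with bid/ask carried forward.

    Rows before the first quote receive the supplied starting values.
    """
    out = df.copy()
    out["BidPrice"] = out["BidPrice"].ffill().fillna(start_bid)
    out["AskPrice"] = out["AskPrice"].ffill().fillna(start_ask)
    return out


result = fill_quotes(qts_trd, start_bid=46.8, start_ask=47.0)
print(result)
```

Working on a copy leaves the original frame untouched, which makes it easy to compare the result against the row-by-row loop before switching over.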

Solution 1
1 2019-02-28 17:29:09
