Efficiently adding rows to pandas DataFrame

I'm trying to create a simple backtester in Python that lets me assess the performance of a trading strategy. As part of the backtester, I need to record the transactions that occur. For example:

Day     Stock    Action     Quantity
1       AAPL     BUY        20
2       CSCO     SELL       30
2       AMZN     SELL       50

During the trading simulation, I'll need to add more transactions.

What's the most efficient way to do this? Should I create a transactions list at the start of the simulation and append lists such as [5, 'AAPL', 'BUY', 20] as I go? Or should I use a dictionary, a NumPy array, or a pandas DataFrame directly?

Thanks,

Jack

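list.append is an amortised constant-time operation: a Python list over-allocates its internal array of pointers, so most appends just store one more reference, and the occasional resize is cheap when averaged over many appends.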

OTOH, numpy.ndarray and pd.DataFrame objects are backed by fixed-size, contiguous arrays in C that cannot grow in place. Each time you "append" to an array/dataframe, new memory has to be allocated for an entire copy of the old data plus the appended rows, so a single append is linear in the size of the existing data, and adding n rows one at a time ends up quadratic overall.
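To see the copying behaviour for yourself, here is a small sketch using NumPy. The variable names are only for illustration; the point is that np.append hands back a new array and leaves the original untouched.

```python
import numpy as np

a = np.array([1, 2, 3])

# np.append does not modify `a`; it allocates a new array
# and copies the old values plus the new one into it.
b = np.append(a, 4)

print(a.shape, b.shape)          # the original keeps its old shape
print(np.shares_memory(a, b))    # the two arrays do not share a buffer
```

Doing this inside a loop repeats that full copy on every iteration, which is where the cost comes from.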

So, as @ayhan said in a comment, accumulate your data in a list, and then load it into a dataframe once you're done.
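A minimal sketch of that approach, using the columns from the question. The record helper and the COLUMNS constant are hypothetical names made up for this example, not part of any library.

```python
import pandas as pd

COLUMNS = ["Day", "Stock", "Action", "Quantity"]

# Plain Python list that collects one row per transaction.
transactions = []


def record(day, stock, action, quantity):
    """Hypothetical helper: append one transaction row to the list."""
    transactions.append([day, stock, action, quantity])


# During the simulation, appending is cheap.
record(1, "AAPL", "BUY", 20)
record(2, "CSCO", "SELL", 30)
record(2, "AMZN", "SELL", 50)

# Once the simulation is finished, build the DataFrame in a single step.
df = pd.DataFrame(transactions, columns=COLUMNS)
print(df)
```

A list of dicts (one dict per transaction) works the same way with pd.DataFrame(...), if you would rather name the fields on each row.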
