Thanks in advance for any help. I am trying to update a data frame of zeros that has a datetime index (my trade data frame) using the values in another data frame (indexed_orders) on the matching dates. My code is as follows:
import pandas as pd
import numpy as np
import os
import csv
orders = pd.read_csv('./orders/orders.csv', parse_dates=True, sep=',', dayfirst=True) #initiate orders data frame from csv data file
indexed_orders = orders.set_index(['Date']) #set Date as index for orders
print(indexed_orders)
symbol_list = orders['Symbol'].tolist() #creates list of symbols
symbols = list(set(symbol_list)) #gets rid of duplicates in list
dates_list = orders['Date'].tolist() #creates list of order dates
dates_orders = list(set(dates_list)) #gets rid of duplicates in list
start_date = '2011-01-05' #establish date range
end_date = '2011-01-20'
dates = pd.date_range(start_date, end_date) #establish dates from start_date and end_date
trade = pd.DataFrame(0, index = dates, columns = symbols) #establish trade data frame
trade['Cash'] = 0 #add column for future calculations
print(trade)
Which outputs for indexed_orders:
Date Symbol Order Shares
2011-01-10 AAPL BUY 1500
2011-01-13 AAPL SELL 1500
2011-01-13 IBM BUY 4000
2011-01-26 GOOG BUY 1000
2011-02-02 XOM SELL 4000
2011-02-10 XOM BUY 4000
2011-03-03 GOOG SELL 1000
2011-03-03 IBM SELL 2200
2011-06-03 IBM SELL 3300
2011-05-03 IBM BUY 1500
2011-06-10 AAPL BUY 1200
2011-08-01 GOOG BUY 55
2011-08-01 GOOG SELL 55
2011-12-20 AAPL SELL 1200
And outputs the following for trade:
GOOG AAPL XOM IBM Cash
2011-01-05 0 0 0 0 0
2011-01-06 0 0 0 0 0
2011-01-07 0 0 0 0 0
2011-01-08 0 0 0 0 0
2011-01-09 0 0 0 0 0
2011-01-10 0 0 0 0 0
2011-01-11 0 0 0 0 0
2011-01-12 0 0 0 0 0
2011-01-13 0 0 0 0 0
2011-01-14 0 0 0 0 0
2011-01-15 0 0 0 0 0
2011-01-16 0 0 0 0 0
2011-01-17 0 0 0 0 0
2011-01-18 0 0 0 0 0
2011-01-19 0 0 0 0 0
2011-01-20 0 0 0 0 0
I want to update my trade data frame on the dates present in indexed_orders, inserting the number of 'Shares' in the column for the matching 'Symbol' (the AAPL, IBM, GOOG, and XOM columns in trade). I also want the value of 'Shares' to be negative when the 'Order' column in indexed_orders says 'SELL'. In other words, I am trying to come up with code that updates the trade data frame such that printing trade gives:
GOOG AAPL XOM IBM Cash
2011-01-05 0 0 0 0 0
2011-01-06 0 0 0 0 0
2011-01-07 0 0 0 0 0
2011-01-08 0 0 0 0 0
2011-01-09 0 0 0 0 0
2011-01-10 0 1500 0 0 0
2011-01-11 0 0 0 0 0
2011-01-12 0 0 0 0 0
2011-01-13 0 -1500 0 4000 0
2011-01-14 0 0 0 0 0
2011-01-15 0 0 0 0 0
2011-01-16 0 0 0 0 0
2011-01-17 0 0 0 0 0
2011-01-18 0 0 0 0 0
2011-01-19 0 0 0 0 0
2011-01-20 0 0 0 0 0
I think some sort of iteration with nested boolean conditions is needed, but I am having a hard time figuring one out. In particular, I am having difficulty coming up with a way to iterate through the rows and update them based on the datetime index.
Any help would be GREATLY appreciated.
First, you can use the Order column to sign the change in shares. Then, you can group by Date and Symbol and aggregate by summing the orders. This would give you a Series of orders for all unique days and the Symbols traded on those days. Finally, use unstack to convert the Series to tabular format.
import numpy as np
import pandas as pd
df = pd.read_csv('temp.txt', sep='\t')  # tab-separated copy of the orders
print(df)
'''
Date Symbol Order Shares
0 1/10/11 AAPL BUY 1500
1 1/13/11 AAPL SELL 1500
2 1/13/11 IBM BUY 4000
3 1/26/11 GOOG BUY 1000
4 2/2/11 XOM SELL 4000
5 2/10/11 XOM BUY 4000
6 3/3/11 GOOG SELL 1000
7 3/3/11 IBM SELL 2200
8 6/3/11 IBM SELL 3300
9 5/3/11 IBM BUY 1500
10 6/10/11 AAPL BUY 1200
11 8/1/11 GOOG BUY 55
12 8/1/11 GOOG SELL 55
13 12/20/11 AAPL SELL 1200
'''
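# Sign the share counts: +Shares for BUY, -Shares for SELL
df['SharesChange'] = df.Shares * df.Order.apply(lambda o: 1 if o == 'BUY' else -1)
# Sum per (Date, Symbol), then pivot the symbols into columns
df = df.groupby(['Date', 'Symbol']).agg({'SharesChange': 'sum'}).unstack().fillna(0)
print(df)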
'''
SharesChange
Symbol AAPL GOOG IBM XOM
Date
1/10/11 1500 0 0 0
1/13/11 -1500 0 4000 0
1/26/11 0 1000 0 0
12/20/11 -1200 0 0 0
2/10/11 0 0 0 4000
2/2/11 0 0 0 -4000
3/3/11 0 -1000 -2200 0
5/3/11 0 0 1500 0
6/10/11 1200 0 0 0
6/3/11 0 0 -3300 0
8/1/11 0 0 0 0
'''
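The result above covers every order date, whereas the question asks for a frame restricted to a fixed calendar range. Note also that Date is read as text there, so the rows sort as strings (12/20/11 lands before 2/10/11). The following is a minimal sketch of one way to close that gap, assuming a recent pandas on Python 3 and dates parsed as datetimes: sign and pivot as before, then reindex onto the date range. The inline CSV text and the names ORDERS_CSV, sign, and changes are illustrative stand-ins, not part of the original post.

```python
import io

import pandas as pd

# Hypothetical inline copy of the first few orders from the question.
ORDERS_CSV = """Date,Symbol,Order,Shares
2011-01-10,AAPL,BUY,1500
2011-01-13,AAPL,SELL,1500
2011-01-13,IBM,BUY,4000
2011-01-26,GOOG,BUY,1000
2011-02-02,XOM,SELL,4000
"""

# Parse the Date column explicitly so it becomes a real datetime column.
orders = pd.read_csv(io.StringIO(ORDERS_CSV), parse_dates=["Date"])

# Sign the share counts: positive for BUY, negative for SELL.
sign = orders["Order"].map({"BUY": 1, "SELL": -1})
orders["SharesChange"] = orders["Shares"] * sign

# Sum per (Date, Symbol) and pivot the symbols into columns.
changes = (
    orders.groupby(["Date", "Symbol"])["SharesChange"]
    .sum()
    .unstack(fill_value=0)
)

# Align onto the calendar range from the question.
# Days without orders are filled with 0; order dates outside the range are dropped.
dates = pd.date_range("2011-01-05", "2011-01-20")
trade = changes.reindex(dates, fill_value=0)
trade["Cash"] = 0  # placeholder column for later calculations

print(trade.loc["2011-01-10":"2011-01-13"])
```

The columns come out in alphabetical order here. If a specific order such as the one in the question matters, selecting the columns by an explicit list, for example trade[['GOOG', 'AAPL', 'XOM', 'IBM', 'Cash']], should rearrange them.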