Here's my data structure:
                date_time  ticker  stock_price  type   bid    ask  impVol               symbol  strike_price  delta  vega  gamma  theta  rho  diff
371   2021-02-19 14:28:45    AMZN      3328.23   put  44.5  46.85     NaN  AMZN210226P03330000        3330.0    NaN   NaN    NaN    NaN  NaN  1.77
370   2021-02-19 14:28:45    AMZN      3328.23  call  43.5  45.80     NaN  AMZN210226C03330000        3330.0    NaN   NaN    NaN    NaN  NaN  1.77
1066  2021-02-19 14:28:55    AMZN      3328.23  call  43.5  45.80     NaN  AMZN210226C03330000        3330.0    NaN   NaN    NaN    NaN  NaN  1.77
1067  2021-02-19 14:28:55    AMZN      3328.23   put  44.5  46.85     NaN  AMZN210226P03330000        3330.0    NaN   NaN    NaN    NaN  NaN  1.77
My goal is to group by date_time, then create separate columns for the put's bid and ask and the call's bid and ask.
My expected output would be something like this:
                date_time  ticker  stock_price  put_bid  put_ask  call_bid  call_ask  impVol               symbol  strike_price  delta  vega  gamma  theta  rho  diff
371   2021-02-19 14:28:45    AMZN      3328.23     44.5    46.85      43.5     45.80     NaN  AMZN210226P03330000        3330.0    NaN   NaN    NaN    NaN  NaN  1.77
1066  2021-02-19 14:28:55    AMZN      3328.23     43.5    45.80      44.5     46.85     NaN  AMZN210226C03330000        3330.0    NaN   NaN    NaN    NaN  NaN  1.77
I tried everything I can find for examples, including pivoting such as this:
df=pd.pivot_table(df,index=['date_time','type'],columns=df.groupby(['date_time','type']).cumcount().add(1),values=['market_price'],aggfunc='sum')
df.columns=df.columns.map('{0[0]}{0[1]}'.format)
I think I'm on the right path, but I just can't figure it out. Any help would be incredibly appreciated.
Why are you trying to use a groupby? pandas.pivot() does the grouping for you.
You haven't provided a reproducible example (hint: please do next time) so I made up some random data to explain a possible solution. Note this is not identical to what you need but it's a starting point:
import numpy as np
import pandas as pd

# Toy data: two periods, each with one 'buy' row and one 'sell' row.
df = pd.DataFrame()
df['period'] = np.repeat([1, 2], 2)
df['product'] = 'kiwi'
df['type'] = np.tile(['buy', 'sell'], 2)
df['price'] = np.arange(1, 5)

# One row per (period, product), one column per value of 'type'.
out = pd.pivot_table(df, index=['period', 'product'], columns=['type'], values='price')
You need to specify what you want on the left (index), what you want on the top (columns) and which values (values) you want to show for this combination.
Also, are you sure the put and call rows will always share exactly the same date_time? What if the two timestamps are off by even one second; is that possible? And what if the stock price differs between the first and the second row of your table? I don't know your data, so I can't tell whether that can happen, but it's something to think about.
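If the timestamps can drift apart, one possible workaround is to round them into a common bucket before pivoting. This is only a sketch: the `quotes` frame, the `bucket` column and the 5-second bucket width are made up for illustration, and a put/call pair that straddles a bucket boundary would still end up in two different rows.

```python
import pandas as pd

# Hypothetical quotes: the first put/call pair is one second apart.
quotes = pd.DataFrame({
    "date_time": pd.to_datetime([
        "2021-02-19 14:28:45",
        "2021-02-19 14:28:46",
        "2021-02-19 14:28:55",
        "2021-02-19 14:28:55",
    ]),
    "type": ["put", "call", "call", "put"],
    "bid": [44.5, 43.5, 43.5, 44.5],
})

# Round each timestamp down to a 5-second bucket (the width is an assumption).
quotes["bucket"] = quotes["date_time"].dt.floor("5s")

# Pivot on the bucket instead of the raw timestamp.
wide = quotes.pivot(index="bucket", columns="type", values="bid")
print(wide)
```

Pivoting on the raw date_time here would give three rows with NaN holes; pivoting on the bucket gives two complete rows.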
Also note that my example does not specify an aggregation function, so pivot_table falls back to its default, the mean. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.pivot_table.html
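To see why the default matters, here is a small sketch with made-up data in which period 1 has two 'buy' rows (that duplicate is my assumption, added only to show the effect). Passing aggfunc explicitly makes the choice visible:

```python
import pandas as pd

# Toy data; period 1 deliberately has two 'buy' rows.
df = pd.DataFrame({
    "period": [1, 1, 1, 2, 2],
    "product": "kiwi",
    "type": ["buy", "buy", "sell", "buy", "sell"],
    "price": [1, 10, 2, 3, 4],
})

# Default aggregation: duplicates are silently averaged.
out_mean = pd.pivot_table(df, index=["period", "product"],
                          columns="type", values="price")

# Explicit aggregation: keep the first value seen in each group.
out_first = pd.pivot_table(df, index=["period", "product"],
                           columns="type", values="price", aggfunc="first")

print(out_mean)
print(out_first)
```

With the default, the two period-1 'buy' prices (1 and 10) are averaged to 5.5; with aggfunc="first" the cell holds 1.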
To use a pivot table to reorient your data the way you're describing, you'll need to include all columns which vary with type, which in this case includes "symbol" (note the P vs. C in the option symbol):
In [10]: pivoted = df.pivot(
    ...:     index=['date_time', 'ticker', 'stock_price', 'impVol', 'strike_price', 'delta', 'vega', 'gamma', 'theta', 'rho', 'diff'],
    ...:     columns=['type', 'symbol'],
    ...:     values=['bid', 'ask'],
    ...: )
In [11]: pivoted
Out[11]:
                                                                                                            bid                                       ask
type                                                                                                        put                 call                  put                 call
symbol                                                                                      AMZN210226P03330000  AMZN210226C03330000  AMZN210226P03330000  AMZN210226C03330000
date_time           ticker stock_price impVol strike_price delta vega gamma theta rho diff
2021-02-19 14:28:45 AMZN   3328.23     NaN    3330.0       NaN   NaN  NaN   NaN   NaN 1.77                 44.5                 43.5                46.85                 45.8
2021-02-19 14:28:55 AMZN   3328.23     NaN    3330.0       NaN   NaN  NaN   NaN   NaN 1.77                 44.5                 43.5                46.85                 45.8
If you'd like, you could then relabel your columns:
In [12]: pivoted.columns = pd.Index([i[0] + '_' + i[1] for i in pivoted.columns.values])
In [13]: pivoted
Out[13]:
                                                                                            bid_put  bid_call  ask_put  ask_call
date_time           ticker stock_price impVol strike_price delta vega gamma theta rho diff
2021-02-19 14:28:45 AMZN   3328.23     NaN    3330.0       NaN   NaN  NaN   NaN   NaN 1.77     44.5      43.5    46.85      45.8
2021-02-19 14:28:55 AMZN   3328.23     NaN    3330.0       NaN   NaN  NaN   NaN   NaN 1.77     44.5      43.5    46.85      45.8
Alternatively, you could just leave symbol out of the pivot entirely. Either way, you have to stack symbol, drop it, or handle it manually in some other way, because its value is not the same for each "type".
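For completeness, here is a rough sketch of the "leave symbol out" variant on the four rows from the question, ending with flat column names in the put_bid / call_bid style the question asks for. Some assumptions on my part: the all-NaN columns are omitted to keep the example short, the names `keys` and `wide` are mine, and passing a list as `index` to DataFrame.pivot needs pandas 1.1 or newer.

```python
import pandas as pd

# The four sample rows from the question (all-NaN columns omitted for brevity).
df = pd.DataFrame({
    "date_time": pd.to_datetime(
        ["2021-02-19 14:28:45"] * 2 + ["2021-02-19 14:28:55"] * 2
    ),
    "ticker": "AMZN",
    "stock_price": 3328.23,
    "type": ["put", "call", "call", "put"],
    "bid": [44.5, 43.5, 43.5, 44.5],
    "ask": [46.85, 45.80, 45.80, 46.85],
    "symbol": ["AMZN210226P03330000", "AMZN210226C03330000",
               "AMZN210226C03330000", "AMZN210226P03330000"],
    "strike_price": 3330.0,
    "diff": 1.77,
})

# Columns that are identical for the put and the call of one timestamp.
keys = ["date_time", "ticker", "stock_price", "strike_price", "diff"]

# symbol is neither an index key nor a value, so it is not part of the result.
wide = df.pivot(index=keys, columns="type", values=["bid", "ask"])

# Flatten ('bid', 'put') -> 'put_bid', and so on.
wide.columns = [f"{opt_type}_{field}" for field, opt_type in wide.columns]
wide = wide.reset_index()
print(wide)
```

This gives one row per timestamp with the columns call_bid, put_bid, call_ask and put_ask next to the key columns. If you need symbol in the output, you would have to carry it through separately, for example as put_symbol and call_symbol.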