简体   繁体   中英

'Conditional' groupby in pandas DataFrame

I am calculating market beta using daily data with pandas.DataFrame. That is, I want to calculate variances of market return and covariances between market return and individual stock return using 252 days window with 200 minimum observation conditions with groupby. Beta is Var(market_return)/Cov(market_return, stock_return). First, I used unconditional groupby to obtain the beta value, which means that I calculate the variances and covariances for every day of my data sample. However, then, I realize that calculating all betas consumes too much time and is wasteful. This is because only end-of-the-month data will be used. For example, even if betas are calculated on 1st Jan, 2nd Jan, ..., and 31st Jan, only the beta of 31st Jan will be used. Therefore, I want to know if there is any way to run my groupby code conditionally.

For example, my output is as follows using 252 days window with 200 minimum observation groupby.

stock_key date var(market_return) covar(market_return, stock_return)
A 2012-01-26 9.4212 -4.23452
A 2012-01-27 9.3982 -4.18421
A 2012-01-28 9.1632 -4.33552
A 2012-01-29 9.0456 -4.55831
A 2012-01-30 9.2231 -4.92373
A 2012-01-31 9.0687 -4.04133
...
A 2012-02-27 8.9345 -4.72344
A 2012-02-28 9.0010 -4.82349
...
B 2012-01-26 4.8456 -1.42325
B 2012-01-27 4.8004 -1.18421
B 2012-01-28 4.0983 -1.02842
B 2012-01-29 4.9465 -1.13834
B 2012-01-30 4.7354 -1.63450
B 2012-01-31 4.1945 -1.18234

I want to know is there any way to get result as follows.

stock_key date var(market_return) covar(market_return, stock_return)
A 2012-01-31 9.0687 -4.04133
A 2012-02-28 9.0010 -4.82349
B 2012-01-31 4.1945 -1.18234

Thank you for reading my question.

+ I add my code as follows. Here, PERMNO is the id of stocks.

dtmpPair = dlongPair[['PERMNO','dayMktmRF','eadjret']]
dgrpPair = dtmpPair.groupby(['PERMNO']).rolling(252, min_periods = 200)
dgrpCov = dgrpPair.cov().unstack()

ddenom = dgrpCov['dayMktmRF']['dayMktmRF'].reset_index()
ddenom = ddenom[['PERMNO','dayMktmRF']]
ddenom['date'] = dlongPair['date']
ddenom.columns = ['PERMNO','MktVar','date']

dnumer = dgrpCov['dayMktmRF']['eadjret'].reset_index()
dnumer = dnumer[['PERMNO','eadjret']]
dnumer['date'] = dlongPair['date']
dnumer.columns = ['PERMNO','Cov','date']

ddfBeta = dnumer.merge(ddenom, on = ['PERMNO','date'])
ddfBeta['beta_daily'] = ddfBeta['Cov'] / ddfBeta['MktVar']
ddfBeta = ddfBeta[ddfBeta['beta_daily'].notnull()]
ddfBeta['month'] = ddfBeta['date'].dt.month
ddfBeta['year'] = ddfBeta['date'].dt.year
beta_daily = ddfBeta[['date','PERMNO','year','month','beta_daily']]

Here, dlongPair dataframe consists of data as follows. 在此处输入图像描述

Without using groupby we can check if the date in the row is the last day of the month.

df['date']=pd.to_datetime(df['date']) #string to datetime

#Is the date in the row the last day of that month?
dfx=df[df['date']  - pd.offsets.Day() + pd.offsets.MonthEnd(1)==df['date']]

Output :

   stock_key       date  var(market_return)  covar(market_return, stock_return)
5          A 2012-01-31              9.0687                            -4.04133
15         B 2012-01-31              4.1945                            -1.18234

Note : 2012-02's last day is 29.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM