I have a list of 10 stocks differentiated by PERMNO. I would like to group those stocks by PERMNO and calculate the rolling correlation between the stock return (RET) for each PERMNO with the market return (vwretd). The code I am trying is below.
CRSP['rollingcorr'] = CRSP.groupby('PERMNO').rolling_corr(CRSP['RET'],CRSP['vwretd'],10)
The error I am getting is below.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-32-c18e1ce01302> in <module>()
1 #CRSP['rollingcorr'] = CRSP.rolling_corr(CRSP['vwretd'],CRSP['RET'],120)
----> 2 CRSP['rollingmean'] = CRSP.groupby('PERMNO').rolling_corr(CRSP['vwretd'],10)
3 CRSP.head(20)
C:\Users\rebortz\Anaconda\lib\site-packages\pandas\core\groupby.pyc in __getattr__(self, attr)
296
297 raise AttributeError("%r object has no attribute %r" %
--> 298 (type(self).__name__, attr))
299
300 def __getitem__(self, key):
AttributeError: 'DataFrameGroupBy' object has no attribute 'rolling_corr'
please help!
Thanks
Running rolling.corr()
on Python 3.5 generates a warning the function is deprecated and may stop working in future. Using Series.rolling(window=<period>).corr(other=series)
instead is recommended. Eg
data['scrip1DailyReturn'].rolling(window=90).corr(other=data['scrip2DailyReturn'])
Use pandas.rolling_corr
, not DataFrame.rolling_corr
. Besides, groupby
returns a generator. See below code.
Code:
import pandas as pd
df = pd.read_csv("color.csv")
df_gen = df.copy().groupby("Color")
for key, value in df_gen:
print "key: {}".format(key)
print value.rolling_corr(value["Value1"],value["Value2"], 3)
Output:
key: Blue
1 NaN
3 NaN
6 0.931673
8 0.865066
10 0.089304
12 -0.998656
15 -0.971373
17 -0.667316
dtype: float64
key: Red
0 NaN
2 NaN
5 -0.911357
9 -0.152221
11 -0.971153
14 0.438697
18 -0.550727
dtype: float64
key: Yellow
4 NaN
7 NaN
13 -0.040330
16 0.879371
dtype: float64
You can change the loop part to the following to view the original dataframe post-grouping with a new column as well.
for key, value in df_gen:
value["ROLL_CORR"] = pd.rolling_corr(value["Value1"],value["Value2"], 3)
print value
Output:
Color Value1 Value2 ROLL_CORR
1 Blue 0.951227 0.514999 NaN
3 Blue 0.649112 0.513052 NaN
6 Blue 0.148165 0.342205 0.931673
8 Blue 0.626883 0.421530 0.865066
10 Blue 0.286738 0.583811 0.089304
12 Blue 0.966779 0.227340 -0.998656
15 Blue 0.065493 0.887640 -0.971373
17 Blue 0.757932 0.900103 -0.667316
key: Red
Color Value1 Value2 ROLL_CORR
0 Red 0.201435 0.981871 NaN
2 Red 0.522955 0.357239 NaN
5 Red 0.806326 0.310039 -0.911357
9 Red 0.656126 0.678047 -0.152221
11 Red 0.435898 0.908388 -0.971153
14 Red 0.116419 0.555821 0.438697
18 Red 0.793102 0.168033 -0.550727
key: Yellow
Color Value1 Value2 ROLL_CORR
4 Yellow 0.099474 0.143293 NaN
7 Yellow 0.073128 0.749297 NaN
13 Yellow 0.006777 0.318383 -0.040330
16 Yellow 0.345647 0.993382 0.879371
If you want to join them all together after processing (this might be confusing to others, by the way), just use concat
after processing groups.
import pandas as pd
df = pd.read_csv("color.csv")
df_gen = df.copy().groupby("Color")
dfs = [] # Container for dataframes.
for key, value in df_gen:
value["ROLL_CORR"] = pd.rolling_corr(value["Value1"],value["Value2"], 3)
print value
dfs.append(value)
df_final = pd.concat(dfs)
print df_final
Output:
Color Value1 Value2 ROLL_CORR
1 Blue 0.951227 0.514999 NaN
3 Blue 0.649112 0.513052 NaN
6 Blue 0.148165 0.342205 0.931673
8 Blue 0.626883 0.421530 0.865066
10 Blue 0.286738 0.583811 0.089304
12 Blue 0.966779 0.227340 -0.998656
15 Blue 0.065493 0.887640 -0.971373
17 Blue 0.757932 0.900103 -0.667316
0 Red 0.201435 0.981871 NaN
2 Red 0.522955 0.357239 NaN
5 Red 0.806326 0.310039 -0.911357
9 Red 0.656126 0.678047 -0.152221
11 Red 0.435898 0.908388 -0.971153
14 Red 0.116419 0.555821 0.438697
18 Red 0.793102 0.168033 -0.550727
4 Yellow 0.099474 0.143293 NaN
7 Yellow 0.073128 0.749297 NaN
13 Yellow 0.006777 0.318383 -0.040330
16 Yellow 0.345647 0.993382 0.879371
Hope this helps.
I found an efficient solution. Fairly simple.
def roll_corr_groupby(x,i):
x['Z'] = rolling_corr(x['col 1'], x['col 2'],i)
return x
x.groupby(['key']).apply(roll_corr_groupby)
x.head()
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.