I have a large df at hand that looks like the following example, but with many more PERMNOs per day. I would like to compute a rolling two-period variance of daily returns for each PERMNO.
I know how to do this for each single period:
df['Monthly Variance'] = df.groupby(['PERMNO', 'Period'])['RET'].transform('var')
But how do I do this for rolling periods? E.g., every row in period 2019-05 should contain the variance of all daily returns from 2019-05 and 2019-04 combined.
Data:
date Period PERMNO RET SPREAD
0 2019-03-19 2019-03 93436 -0.007496 0.037349
1 2019-03-29 2019-03 93436 0.004450 0.020619
2 2019-04-10 2019-04 93436 0.013771 0.020109
3 2019-04-23 2019-04 93436 0.004377 0.038514
4 2019-05-03 2019-05 93436 0.044777 0.053883
5 2019-05-15 2019-05 93436 -0.001550 0.031920
6 2019-05-28 2019-05 93436 -0.010124 0.038062
7 2019-06-07 2019-06 93436 -0.007041 0.036093
8 2019-06-19 2019-06 93436 0.007520 0.030354
9 2019-07-01 2019-07 93436 0.016602 0.030137
10 2019-07-12 2019-07 93436 0.027158 0.023654
11 2019-07-24 2019-07 93436 0.018104 0.030640
12 2019-08-05 2019-08 93436 -0.025689 0.024769
13 2019-08-15 2019-08 93436 -0.018122 0.047317
14 2019-08-27 2019-08 93436 -0.004279 0.031929
15 2019-09-09 2019-09 93436 0.019081 0.019762
16 2019-09-19 2019-09 93436 0.012773 0.012661
17 2019-10-01 2019-10 93436 0.015859 0.028520
18 2019-10-11 2019-10 93436 0.012871 0.017301
19 2019-10-23 2019-10 93436 -0.003521 0.019057
20 2019-11-04 2019-11 93436 0.013278 0.041001
21 2019-11-14 2019-11 93436 0.009361 0.031874
22 2019-11-26 2019-11 93436 -0.022061 0.025680
23 2019-12-09 2019-12 93436 0.010837 0.027964
24 2019-12-19 2019-12 93436 0.027699 0.026103
import pandas as pd
from pandas import Timestamp, Period
d = {'date': {0: Timestamp('2019-03-19 00:00:00'),
1: Timestamp('2019-03-29 00:00:00'),
2: Timestamp('2019-04-10 00:00:00'),
3: Timestamp('2019-04-23 00:00:00'),
4: Timestamp('2019-05-03 00:00:00'),
5: Timestamp('2019-05-15 00:00:00'),
6: Timestamp('2019-05-28 00:00:00'),
7: Timestamp('2019-06-07 00:00:00'),
8: Timestamp('2019-06-19 00:00:00'),
9: Timestamp('2019-07-01 00:00:00'),
10: Timestamp('2019-07-12 00:00:00'),
11: Timestamp('2019-07-24 00:00:00'),
12: Timestamp('2019-08-05 00:00:00'),
13: Timestamp('2019-08-15 00:00:00'),
14: Timestamp('2019-08-27 00:00:00'),
15: Timestamp('2019-09-09 00:00:00'),
16: Timestamp('2019-09-19 00:00:00'),
17: Timestamp('2019-10-01 00:00:00'),
18: Timestamp('2019-10-11 00:00:00'),
19: Timestamp('2019-10-23 00:00:00'),
20: Timestamp('2019-11-04 00:00:00'),
21: Timestamp('2019-11-14 00:00:00'),
22: Timestamp('2019-11-26 00:00:00'),
23: Timestamp('2019-12-09 00:00:00'),
24: Timestamp('2019-12-19 00:00:00')},
'Period': {0: Period('2019-03', 'M'),
1: Period('2019-03', 'M'),
2: Period('2019-04', 'M'),
3: Period('2019-04', 'M'),
4: Period('2019-05', 'M'),
5: Period('2019-05', 'M'),
6: Period('2019-05', 'M'),
7: Period('2019-06', 'M'),
8: Period('2019-06', 'M'),
9: Period('2019-07', 'M'),
10: Period('2019-07', 'M'),
11: Period('2019-07', 'M'),
12: Period('2019-08', 'M'),
13: Period('2019-08', 'M'),
14: Period('2019-08', 'M'),
15: Period('2019-09', 'M'),
16: Period('2019-09', 'M'),
17: Period('2019-10', 'M'),
18: Period('2019-10', 'M'),
19: Period('2019-10', 'M'),
20: Period('2019-11', 'M'),
21: Period('2019-11', 'M'),
22: Period('2019-11', 'M'),
23: Period('2019-12', 'M'),
24: Period('2019-12', 'M')},
'PERMNO': {0: 93436,
1: 93436,
2: 93436,
3: 93436,
4: 93436,
5: 93436,
6: 93436,
7: 93436,
8: 93436,
9: 93436,
10: 93436,
11: 93436,
12: 93436,
13: 93436,
14: 93436,
15: 93436,
16: 93436,
17: 93436,
18: 93436,
19: 93436,
20: 93436,
21: 93436,
22: 93436,
23: 93436,
24: 93436},
'RET': {0: -0.007496,
1: 0.00445,
2: 0.013771,
3: 0.004377,
4: 0.044777,
5: -0.00155,
6: -0.010124,
7: -0.007041,
8: 0.00752,
9: 0.016602,
10: 0.027158,
11: 0.018104,
12: -0.025689,
13: -0.018122,
14: -0.004279,
15: 0.019081,
16: 0.012773,
17: 0.015859,
18: 0.012871,
19: -0.003521,
20: 0.013278,
21: 0.009361,
22: -0.022061,
23: 0.010837,
24: 0.027699},
'SPREAD': {0: 0.03734912462419806,
1: 0.02061930783242268,
2: 0.02010868822370299,
3: 0.03851421309872922,
4: 0.053883031997904014,
5: 0.031920088790233066,
6: 0.038062228476857696,
7: 0.03609261156529571,
8: 0.030353750113091504,
9: 0.030137440339402532,
10: 0.02365353870704016,
11: 0.030639552742658626,
12: 0.024769351113690646,
13: 0.04731741904986996,
14: 0.031929443946611374,
15: 0.019761767656938437,
16: 0.012661329848064019,
17: 0.028520051854639707,
18: 0.017300757667841702,
19: 0.01905709094660478,
20: 0.04100106573753255,
21: 0.03187425271937228,
22: 0.025680188759395033,
23: 0.027963531931584486,
24: 0.026103430012610333}}
df = pd.DataFrame(d)
This works with pandas resampling. Note that you need a datetime column that meets the requirements for resampling, which effectively makes the Period column redundant. You can also look into rolling(); I've included an example of that as well.
df["ts"] = pd.to_datetime(df.date, utc=True)
df["Monthly Variance"] = df.groupby(["PERMNO"]).resample("M", on="ts")["RET"].transform("var")
df["Bi-Monthly Variance"] = df.groupby(["PERMNO"]).resample("2M", on="ts")["RET"].transform("var")
df["Quarterly Variance"] = df.groupby(["PERMNO"]).resample("Q", on="ts")["RET"].transform("var")
df["Yearly Variance"] = df.groupby(["PERMNO"]).resample("Y", on="ts")["RET"].transform("var")
df["Rolling Variance"] = df.rolling(10, on="ts")["RET"].var()
To only recalculate the latest data rather than the whole data frame:
dfsub = df[df["ts"] >= pd.Timestamp("2019-08-01", tz="UTC")].copy()
df.loc[dfsub.index, "Bi-Monthly Variance"] = 0
df.loc[dfsub.index, "Bi-Monthly Variance"] = df.loc[dfsub.index].groupby(["PERMNO"]).resample("2M", on="ts")["RET"].transform("var")
date Period PERMNO RET SPREAD ts Monthly Variance Bi-Monthly Variance Quarterly Variance Yearly Variance Rolling Variance
0 2019-03-19 2019-03 93436 -0.007496 0.037349 2019-03-19 00:00:00+00:00 0.000071 0.000071 0.000071 0.000268 NaN
1 2019-03-29 2019-03 93436 0.004450 0.020619 2019-03-29 00:00:00+00:00 0.000071 0.000071 0.000071 0.000268 NaN
2 2019-04-10 2019-04 93436 0.013771 0.020109 2019-04-10 00:00:00+00:00 0.000044 0.000448 0.000340 0.000268 NaN
3 2019-04-23 2019-04 93436 0.004377 0.038514 2019-04-23 00:00:00+00:00 0.000044 0.000448 0.000340 0.000268 NaN
4 2019-05-03 2019-05 93436 0.044777 0.053883 2019-05-03 00:00:00+00:00 0.000872 0.000448 0.000340 0.000268 NaN
5 2019-05-15 2019-05 93436 -0.001550 0.031920 2019-05-15 00:00:00+00:00 0.000872 0.000448 0.000340 0.000268 NaN
6 2019-05-28 2019-05 93436 -0.010124 0.038062 2019-05-28 00:00:00+00:00 0.000872 0.000448 0.000340 0.000268 NaN
7 2019-06-07 2019-06 93436 -0.007041 0.036093 2019-06-07 00:00:00+00:00 0.000106 0.000167 0.000340 0.000268 NaN
8 2019-06-19 2019-06 93436 0.007520 0.030354 2019-06-19 00:00:00+00:00 0.000106 0.000167 0.000340 0.000268 NaN
9 2019-07-01 2019-07 93436 0.016602 0.030137 2019-07-01 00:00:00+00:00 0.000033 0.000167 0.000374 0.000268 0.000261
10 2019-07-12 2019-07 93436 0.027158 0.023654 2019-07-12 00:00:00+00:00 0.000033 0.000167 0.000374 0.000268 0.000273
11 2019-07-24 2019-07 93436 0.018104 0.030640 2019-07-24 00:00:00+00:00 0.000033 0.000167 0.000374 0.000268 0.000275
12 2019-08-05 2019-08 93436 -0.025689 0.024769 2019-08-05 00:00:00+00:00 0.000118 0.000370 0.000374 0.000268 0.000410
13 2019-08-15 2019-08 93436 -0.018122 0.047317 2019-08-15 00:00:00+00:00 0.000118 0.000370 0.000374 0.000268 0.000475
14 2019-08-27 2019-08 93436 -0.004279 0.031929 2019-08-27 00:00:00+00:00 0.000118 0.000370 0.000374 0.000268 0.000284
15 2019-09-09 2019-09 93436 0.019081 0.019762 2019-09-09 00:00:00+00:00 0.000020 0.000370 0.000374 0.000268 0.000318
16 2019-09-19 2019-09 93436 0.012773 0.012661 2019-09-19 00:00:00+00:00 0.000020 0.000370 0.000374 0.000268 0.000308
17 2019-10-01 2019-10 93436 0.015859 0.028520 2019-10-01 00:00:00+00:00 0.000109 0.000214 0.000221 0.000268 0.000301
18 2019-10-11 2019-10 93436 0.012871 0.017301 2019-10-11 00:00:00+00:00 0.000109 0.000214 0.000221 0.000268 0.000304
19 2019-10-23 2019-10 93436 -0.003521 0.019057 2019-10-23 00:00:00+00:00 0.000109 0.000214 0.000221 0.000268 0.000304
20 2019-11-04 2019-11 93436 0.013278 0.041001 2019-11-04 00:00:00+00:00 0.000375 0.000214 0.000221 0.000268 0.000256
21 2019-11-14 2019-11 93436 0.009361 0.031874 2019-11-14 00:00:00+00:00 0.000375 0.000214 0.000221 0.000268 0.000236
22 2019-11-26 2019-11 93436 -0.022061 0.025680 2019-11-26 00:00:00+00:00 0.000375 0.000214 0.000221 0.000268 0.000214
23 2019-12-09 2019-12 93436 0.010837 0.027964 2019-12-09 00:00:00+00:00 0.000142 0.000142 0.000221 0.000268 0.000159
24 2019-12-19 2019-12 93436 0.027699 0.026103 2019-12-19 00:00:00+00:00 0.000142 0.000142 0.000221 0.000268 0.000185
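One caveat: resample("2M") produces adjacent, non-overlapping two-month bins, which is not quite the overlapping window the question asks for (each month combined with the previous month, for every month). A minimal sketch of that overlapping version, assuming the Period column holds monthly pd.Period values as in the question's data (the toy frame and the loop here are illustrative, not a pandas built-in):

```python
import pandas as pd

# Toy frame in the shape of the question's data (one PERMNO, monthly periods).
df = pd.DataFrame({
    "PERMNO": [93436] * 4,
    "Period": pd.PeriodIndex(["2019-03", "2019-03", "2019-04", "2019-04"], freq="M"),
    "RET": [0.01, 0.03, 0.02, 0.04],
})

# Overlapping two-period variance: for each PERMNO and each monthly period p,
# take the variance of RET over periods p-1 and p together.
out = pd.Series(index=df.index, dtype=float)
for _, g in df.groupby("PERMNO"):
    for p in g["Period"].unique():
        window = g.loc[g["Period"].isin([p - 1, p]), "RET"]  # this month + previous month
        out[g.index[g["Period"] == p]] = window.var()
df["Two-Period Variance"] = out
print(df)
```

For many PERMNOs and periods this loop is not the fastest option (it can be vectorized by combining per-period sums and counts of adjacent months), but it keeps the windowing logic explicit and matches the question's semantics exactly.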