I have a DataFrame df:
                  sales  net_pft
STK_ID RPT_Date
600141 20101231  46.780    1.833
       20110331  13.725    0.384
       20110630  32.733    1.132
       20110930  50.386    1.923
       20111231  65.685    2.325
       20120331  21.088    0.656
       20120630  46.952    1.591
600809 20101231  30.166    4.945
       20110331  18.724    5.061
       20110630  28.948    6.586
       20110930  35.637    7.075
       20111231  44.882    7.805
       20120331  22.140    4.925
       20120630  38.157    7.868
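For anyone who wants to reproduce the frame above, here is a minimal sketch. It assumes that STK_ID and RPT_Date are integer index levels (the printout does not show the dtypes) and uses pd.MultiIndex.from_product, since both stocks share the same seven report dates.

```python
import pandas as pd

# Report dates shared by both stocks (assumed to be stored as integers).
dates = [20101231, 20110331, 20110630, 20110930, 20111231, 20120331, 20120630]

# Two-level row index: every STK_ID paired with every RPT_Date.
index = pd.MultiIndex.from_product(
    [[600141, 600809], dates], names=["STK_ID", "RPT_Date"]
)

df = pd.DataFrame(
    {
        "sales": [46.780, 13.725, 32.733, 50.386, 65.685, 21.088, 46.952,
                  30.166, 18.724, 28.948, 35.637, 44.882, 22.140, 38.157],
        "net_pft": [1.833, 0.384, 1.132, 1.923, 2.325, 0.656, 1.591,
                    4.945, 5.061, 6.586, 7.075, 7.805, 4.925, 7.868],
    },
    index=index,
)
print(df)
```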
I want to compute a rolling average of all columns within each STK_ID group. The rule, expressed in pseudocode, is:
if RPT_Date[4:8] == '0331':
    all_column = rolling_mean(all_column, 2)
if RPT_Date[4:8] == '0630':
    all_column = rolling_mean(all_column, 3)
if RPT_Date[4:8] == '0930':
    all_column = rolling_mean(all_column, 4)
if RPT_Date[4:8] == '1231':
    all_column = rolling_mean(all_column, 5)
if is_the_first_row():
    keep_original_values()
Here all_column stands for 'sales' and 'net_pft'. The final result would look like:
                  sales  net_pft
STK_ID RPT_Date
600141 20101231  46.780    1.833   # same as original value
       20110331  30.253    1.109   # average of row1 & row2
       20110630  31.079    1.116   # average of row1 & row2 & row3
       ......
600809 20101231  30.166    4.945   # same as original value
       20110331  24.445    5.003   # average of row1 & row2
       .....
How can I write this as a neat pandas expression?
I think you want an expanding (cumulative) mean within each STK_ID group:
In [29]: df.groupby(level='STK_ID').apply(lambda x: pd.expanding_mean(x))
Out[29]:
                     sales   net_pft
STK_ID RPT_Date
600141 20101231  46.780000  1.833000
       20110331  30.252500  1.108500
       20110630  31.079333  1.116333
       20110930  35.906000  1.318000
       20111231  41.861800  1.519400
       20120331  38.399500  1.375500
       20120630  39.621286  1.406286
600809 20101231  30.166000  4.945000
       20110331  24.445000  5.003000
       20110630  25.946000  5.530667
       20110930  28.368750  5.916750
       20111231  31.671400  6.294400
       20120331  30.082833  6.066167
       20120630  31.236286  6.323571
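Note that the top-level pd.expanding_mean function is no longer available in current pandas releases; the same computation is now written with the .expanding() method. The sketch below rebuilds the question's frame (assuming integer index levels) and computes the expanding mean per group in that spelling. It also shows a hypothetical helper, date_dependent_mean (not part of pandas and not from the original answer), for the case where the window sizes in the question's pseudocode are meant literally: 2, 3, 4 or 5 rows depending on the month-day part of RPT_Date. With this data the two readings agree for the first five rows of each stock and diverge from the sixth row onward.

```python
import pandas as pd

# Rebuild the question's frame (integer index levels are an assumption).
dates = [20101231, 20110331, 20110630, 20110930, 20111231, 20120331, 20120630]
index = pd.MultiIndex.from_product(
    [[600141, 600809], dates], names=["STK_ID", "RPT_Date"]
)
df = pd.DataFrame(
    {
        "sales": [46.780, 13.725, 32.733, 50.386, 65.685, 21.088, 46.952,
                  30.166, 18.724, 28.948, 35.637, 44.882, 22.140, 38.157],
        "net_pft": [1.833, 0.384, 1.132, 1.923, 2.325, 0.656, 1.591,
                    4.945, 5.061, 6.586, 7.075, 7.805, 4.925, 7.868],
    },
    index=index,
)

# Expanding mean within each STK_ID group, in the method-based spelling.
# group_keys=False keeps the original (STK_ID, RPT_Date) index.
expanding = df.groupby(level="STK_ID", group_keys=False).apply(
    lambda g: g.expanding().mean()
)

# Window size per month-day suffix, taken from the question's pseudocode.
WINDOW_BY_MMDD = {"0331": 2, "0630": 3, "0930": 4, "1231": 5}


def date_dependent_mean(group):
    """Hypothetical helper: rolling mean whose window depends on RPT_Date."""
    # Month-day part of each report date, e.g. 20110331 -> "0331".
    mmdd = group.index.get_level_values("RPT_Date").astype(str).str[4:8]
    result = group.copy()
    for suffix, window in WINDOW_BY_MMDD.items():
        mask = mmdd == suffix
        # min_periods=1 averages whatever rows exist so far, so the first
        # row of each group keeps its original values.
        rolled = group.rolling(window, min_periods=1).mean()
        result.loc[mask] = rolled.loc[mask].to_numpy()
    return result


literal = df.groupby(level="STK_ID", group_keys=False).apply(date_dependent_mean)

print(expanding.round(3))
print(literal.round(3))
```

For example, literal.loc[(600141, 20120331), "sales"] is the mean of the 20111231 and 20120331 rows only, whereas the expanding version averages all six rows up to that date. Which of the two is wanted depends on how the pseudocode is read.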