Grouped 3 monthly aggregation and shifting periods in pandas python
The problem
I have a dataframe with many regions and their respective units sold, visits performed and average visit times on a monthly basis. Not all the regions have the same starting date.
So my table looks something like this:
Region Month Visits Average_minutes Units_sold
Region_1 2018.01.01 12 2.22 120
Region_1 2018.02.01 10 2.02 108
Region_2 2017.04.01 4 1.8 60
Region_2 2017.05.01 4 1.6 56
Region_2 2017.06.01 3 1.5 58
Region_1 2018.03.01 11 2.1 103
Region_3 2018.04.01 3 2.22 20
Region_3 2018.05.01 2 2 22
Region_2 2017.07.01 6 1.7 61
Region_1 2018.04.01 14 2.1 125
Region_3 2018.06.01 3 2.3 21
Region_3 2018.07.01 3 2.4 19
Region_1 2018.05.01 10 2.12 116
Region_2 2017.08.01 3 2.1 55
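For anyone who wants to reproduce the problem, here is a minimal sketch that loads the table above into a DataFrame. The variable names `raw` and `df` are only illustrative, and parsing `Month` with the format `%Y.%m.%d` is an assumption based on how the dates are written in the table.

```python
import io

import pandas as pd

# The sample table from the question, whitespace-separated.
raw = """Region Month Visits Average_minutes Units_sold
Region_1 2018.01.01 12 2.22 120
Region_1 2018.02.01 10 2.02 108
Region_2 2017.04.01 4 1.8 60
Region_2 2017.05.01 4 1.6 56
Region_2 2017.06.01 3 1.5 58
Region_1 2018.03.01 11 2.1 103
Region_3 2018.04.01 3 2.22 20
Region_3 2018.05.01 2 2 22
Region_2 2017.07.01 6 1.7 61
Region_1 2018.04.01 14 2.1 125
Region_3 2018.06.01 3 2.3 21
Region_3 2018.07.01 3 2.4 19
Region_1 2018.05.01 10 2.12 116
Region_2 2017.08.01 3 2.1 55
"""

df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Assumption: dates are written as year.month.day, so parse them explicitly.
df["Month"] = pd.to_datetime(df["Month"], format="%Y.%m.%d")

print(df.sort_values(["Region", "Month"]))
```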
What I would like to do is aggregate the monthly data for the different regions into 3-month periods, shifting the starting month forward by one month at a time.
So if we take Region_1 for example, the end result I would like to get is something like this:
Region Date Visits Average_minutes Units_sold 3M_shift
Region_1 2018.01.01 33 2.11 331 0
Region_1 2018.04.01 24 2.11 241 0
Region_1 2018.02.01 35 2.07 336 1
Region_1 2018.05.01 10 2.12 116 1
Region_1 2018.02.01 35 2.07 336 2
Region_1 2018.05.01 10 2.12 116 2
As you can see, the Date column now contains the starting date of each 3-month period, and the 3M_shift column shows how many months the start was shifted compared to the first available month.
Of course the table above shows Region_1 only, but I would like to get this result for all the groups.
More background
So I would like to have the data per group aggregated not only by business-year quarters but at a 3-month frequency, shifting one month forward on every iteration until I get to the last month.
My code looks like this, but it groups the months from the starting date of each region, and I don't really know how to shift the starting month by one and iterate until the last month:
grp = joined.groupby(['Region', pd.Grouper(key="Date", freq='3M')]).agg({"Visits":"sum", "Average_minutes":"mean", "Units_sold":"sum"})
So for Region_1, for example, I get this result:
Region Date Visits Average_minutes Units_sold
Region_1 2018.01.01 33 2.11 331
Region_1 2018.04.01 24 2.11 241
Edit: Added a better visualisation of what I would like to get.
In the picture below you can see what I mean. The green part is what I have so far. I would like to make a loop for the pink part, but I do not know how to do it.
Could you please help me to get the desired outcome?
Thank you very much in advance!
I'm not 100% sure what you are looking for, but the way I interpret it, maybe this will help?
First sort by Region and Month.
df = df.sort_values(['Region', 'Month'])
Then set a multi-index.
df = df.set_index(['Region', 'Month'])
Then group by the region, apply a rolling window of three rows for aggregating, and shift the result back two periods so that each aggregate is labelled with the first month of its window.
df = df.groupby(level='Region', group_keys=False).apply(lambda x: x.rolling(window=3).agg({"Visits":"sum", "Average_minutes":"mean", "Units_sold":"sum"}).shift(-2))
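To see why the shift by two periods is there, here is a small sketch on the Region_1 visits alone. The names `visits`, `trailing` and `leading` are only illustrative. A rolling window of 3 puts each sum on the last row of its window, so shifting by -2 moves it to the first row instead.

```python
import pandas as pd

# Region_1 monthly visits from the question, January to May 2018.
visits = pd.Series(
    [12, 10, 11, 14, 10],
    index=pd.to_datetime(
        ["2018-01-01", "2018-02-01", "2018-03-01", "2018-04-01", "2018-05-01"]
    ),
)

# Each sum sits on the LAST month of its 3-row window,
# so the first two rows have no complete window and are NaN.
trailing = visits.rolling(window=3).sum()

# Shifting by -2 moves each sum to the FIRST month of its window,
# which leaves the last two rows as NaN instead.
leading = trailing.shift(-2)

print(pd.DataFrame({"trailing": trailing, "leading": leading}))
```

The NaN rows at the end of each region in the full result come from this shift: the last two months of a region do not start a complete 3-month window.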
Applied to the sample data, the result is:
                     Visits  Average_minutes  Units_sold
Region   Month
Region_1 2018.01.01    33.0         2.113333       331.0
         2018.02.01    35.0         2.073333       336.0
         2018.03.01    35.0         2.106667       344.0
         2018.04.01     NaN              NaN         NaN
         2018.05.01     NaN              NaN         NaN
Region_2 2017.04.01    11.0         1.633333       174.0
         2017.05.01    13.0         1.600000       175.0
         2017.06.01    12.0         1.766667       174.0
         2017.07.01     NaN              NaN         NaN
         2017.08.01     NaN              NaN         NaN
Region_3 2018.04.01     8.0         2.173333        63.0
         2018.05.01     8.0         2.233333        62.0
         2018.06.01     NaN              NaN         NaN
         2018.07.01     NaN              NaN         NaN
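If the layout with the 3M_shift column from the question is what is needed, one possible way to get it is to loop over the shifts 0, 1 and 2, drop that many leading months per region, and aggregate the remaining rows in consecutive blocks of three. This is only a sketch of one interpretation, not a definitive implementation. The helper `shifted_blocks` is hypothetical, only Region_1 and Region_3 are included to keep it short, and it assumes each region has one row per month with no gaps. Under this interpretation the shift-2 block for Region_1 starts at 2018.03.01, whereas the desired table in the question repeats the shift-1 rows there, which may be a copy-paste slip.

```python
import numpy as np
import pandas as pd

# Region_1 and Region_3 rows from the question.
df = pd.DataFrame({
    "Region": ["Region_1"] * 5 + ["Region_3"] * 4,
    "Month": pd.to_datetime([
        "2018-01-01", "2018-02-01", "2018-03-01", "2018-04-01", "2018-05-01",
        "2018-04-01", "2018-05-01", "2018-06-01", "2018-07-01",
    ]),
    "Visits": [12, 10, 11, 14, 10, 3, 2, 3, 3],
    "Average_minutes": [2.22, 2.02, 2.1, 2.1, 2.12, 2.22, 2.0, 2.3, 2.4],
    "Units_sold": [120, 108, 103, 125, 116, 20, 22, 21, 19],
})


def shifted_blocks(group: pd.DataFrame, size: int = 3) -> pd.DataFrame:
    """Hypothetical helper: aggregate one region in blocks of `size` months,
    once for every starting offset 0 .. size-1.
    Assumes one row per month and no missing months."""
    group = group.sort_values("Month")
    pieces = []
    for shift in range(size):
        part = group.iloc[shift:]          # drop the first `shift` months
        if part.empty:
            break
        block = np.arange(len(part)) // size   # 0,0,0,1,1,1,...
        agg = part.groupby(block).agg(
            Date=("Month", "first"),           # block is labelled by its first month
            Visits=("Visits", "sum"),
            Average_minutes=("Average_minutes", "mean"),
            Units_sold=("Units_sold", "sum"),
        )
        agg["3M_shift"] = shift
        pieces.append(agg)
    return pd.concat(pieces, ignore_index=True)


# Apply the helper region by region and stitch the pieces together.
result = pd.concat(
    [shifted_blocks(g).assign(Region=region) for region, g in df.groupby("Region")],
    ignore_index=True,
)
result = result[["Region", "Date", "Visits", "Average_minutes", "Units_sold", "3M_shift"]]
print(result)
```

The trailing blocks with fewer than three months (for example 2018.04.01 to 2018.05.01 for Region_1) are kept, as in the question's desired output. If only complete 3-month blocks are wanted, they could be filtered out by also aggregating a row count per block.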