Aggregating weekly data by group into monthly sums in pandas
This seems pretty straightforward to do, but I'm very new to pandas and I'm not sure where to start. I have a dataset that contains weekly data for multiple clinics. Every week begins on a Sunday and ends on a Saturday. I'd like to aggregate it into monthly data and keep it sorted by clinic.
This is what it currently looks like:
In [2]: df
Out[2]:
Week                      Clinic  Appointments  Cancellations
2021-11-28 to 2021-12-04  fee     40            4
2021-11-28 to 2021-12-04  fi      21            2
2021-12-05 to 2021-12-11  fee     36            3
2022-02-20 to 2022-02-26  fee     10            1
2022-02-27 to 2022-03-05  fee     45            3
2022-02-27 to 2022-03-05  fi      30            1
TOTAL (all clinics)       ---     182           14
And this is what I want it to become:
Month    Clinic  Appointments  Cancellations
Nov '21  fee     40            4
Nov '21  fi      21            2
Dec '21  fee     36            3
Feb '22  fee     55            4
Feb '22  fi      30            1
TOTAL    ---     182           14
So a week belongs to a month if its beginning date (the Sunday) falls within that month. Also, not all clinics will have data for every week.
What I've tried:
I've been trying to use
df.groupby(['Clinic', 'Week'])
but from there I'm not sure how to aggregate the sorted weekly data and return it as a new Excel worksheet in the format I want. Any hints would be welcome.
'Week' is not in the year_month format you need in your expected output, so you first need to convert it to year_month:
date = df['Week'].str.split(' ', expand=True)[0]
year_month = pd.to_datetime(date, errors='coerce').dt.strftime('%Y-%b').fillna(date)
before you use groupby:
df.groupby([year_month, 'Clinic'])[['Appointments', 'Cancellations']].sum()
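As a minimal end-to-end sketch of the two steps above: the DataFrame below is reconstructed from the table printed in the question, so its exact layout is an assumption. The sketch also selects the two count columns explicitly, on the assumption that a recent pandas version would otherwise try to sum the string column Week as well, and renames the grouping key to Month, which is just a label chosen here.

```python
import pandas as pd

# Sample data reconstructed from the table in the question (an assumption
# about how the real frame is laid out).
df = pd.DataFrame({
    "Week": [
        "2021-11-28 to 2021-12-04", "2021-11-28 to 2021-12-04",
        "2021-12-05 to 2021-12-11", "2022-02-20 to 2022-02-26",
        "2022-02-27 to 2022-03-05", "2022-02-27 to 2022-03-05",
        "TOTAL (all clinics)",
    ],
    "Clinic": ["fee", "fi", "fee", "fee", "fee", "fi", "---"],
    "Appointments": [40, 21, 36, 10, 45, 30, 182],
    "Cancellations": [4, 2, 3, 1, 3, 1, 14],
})

# The first token of "Week" is the Sunday that starts the week
# (or the word "TOTAL" for the summary row).
date = df["Week"].str.split(" ", expand=True)[0]

# Tokens that are not dates become NaT, stay missing after strftime,
# and are then filled back in with the raw token.
year_month = (
    pd.to_datetime(date, errors="coerce")
    .dt.strftime("%Y-%b")
    .fillna(date)
    .rename("Month")  # "Month" is a label chosen for this sketch
)

# Sum only the count columns within each (month, clinic) pair.
monthly = (
    df.groupby([year_month, "Clinic"])[["Appointments", "Cancellations"]]
    .sum()
    .reset_index()
)
print(monthly)
```

The two February weeks for clinic fee collapse into a single row with 55 appointments and 4 cancellations, matching the expected output. From here, something like monthly.to_excel("monthly.xlsx", index=False) should write the result to a worksheet, assuming an Excel writer engine such as openpyxl is installed; the filename is a placeholder.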
Just to add to the above comment from Raymond, using
dt.strftime('%Y-%m')
instead of
dt.strftime('%Y-%b')
will sort the output correctly, because zero-padded month numbers sort chronologically while month abbreviations sort alphabetically.
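A small sketch of the difference, using three hypothetical week-start dates and assuming an English locale for the month abbreviations. The last step is not part of the comment above; it is one possible way to get back to the "Nov '21" style labels from the question after sorting on the numeric key.

```python
import pandas as pd

# Hypothetical week-start dates, chosen only to show the ordering difference.
starts = pd.Series(pd.to_datetime(["2021-11-28", "2021-12-05", "2022-02-20"]))

# Month abbreviations compare alphabetically, so "Dec" lands before "Nov".
by_name = sorted(starts.dt.strftime("%Y-%b"))

# Zero-padded month numbers compare in calendar order.
by_number = sorted(starts.dt.strftime("%Y-%m"))

print(by_name)
print(by_number)

# After sorting on the numeric key, relabel in the "Nov '21" style.
labels = pd.to_datetime(pd.Series(by_number), format="%Y-%m").dt.strftime("%b '%y")
print(labels.tolist())
```

Sorting on the numeric key first and formatting the label afterwards keeps the rows in calendar order while still producing the display format asked for.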