How can I Group By Month from a Date field with Python/Pandas
I have a DataFrame df as follows:
date value_1 value_2
2018.07.06 10 0
2018.07.14 20 1
2018.07.27 20 2
2018.08.06 30 1
2018.08.09 40 3
2018.08.13 20 2
2018.09.10 30 1
2018.09.22 50 2
2018.10.09 20 3
2018.10.27 20 1
I need to group the above data by month to get this output:
date value_1 value_2
2018.07.01 50 3
2018.08.01 90 6
2018.09.01 80 3
2018.10.01 40 4
How can I do this efficiently in pandas?
Try groupby using pd.Grouper with freq='MS' (month start). The date column must already be of datetime dtype:
df.groupby(pd.Grouper(freq='MS', key='date')).sum().reset_index()
Output:
date value_1 value_2
0 2018-07-01 50 3
1 2018-08-01 90 6
2 2018-09-01 80 3
3 2018-10-01 40 4
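For a self-contained version of the pd.Grouper approach, here is a minimal sketch. It assumes the dates arrive as strings such as '2018.07.06' (the question does not state the column dtype), so they are parsed with an explicit format first. The variable name monthly is only illustrative.

```python
import pandas as pd

# Sample data from the question; the string dtype of 'date' is an assumption.
df = pd.DataFrame({
    "date": ["2018.07.06", "2018.07.14", "2018.07.27", "2018.08.06", "2018.08.09",
             "2018.08.13", "2018.09.10", "2018.09.22", "2018.10.09", "2018.10.27"],
    "value_1": [10, 20, 20, 30, 40, 20, 30, 50, 20, 20],
    "value_2": [0, 1, 2, 1, 3, 2, 1, 2, 3, 1],
})

# pd.Grouper with freq needs a datetime column, so parse the strings first.
df["date"] = pd.to_datetime(df["date"], format="%Y.%m.%d")

# 'MS' labels each group with the first day of its month.
monthly = df.groupby(pd.Grouper(key="date", freq="MS")).sum().reset_index()
print(monthly)
```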
And if you want to get the dot date format back, you can use this:
df_out = df.groupby(pd.Grouper(freq='MS', key='date')).sum().reset_index()
df_out['date'] = df_out['date'].dt.strftime('%Y.%m.%d')
df_out
Output:
date value_1 value_2
0 2018.07.01 50 3
1 2018.08.01 90 6
2 2018.09.01 80 3
3 2018.10.01 40 4
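Another option is to shift each date back to the start of its month with pd.offsets.MonthBegin(-1) and group on the result. Note that a date already on the 1st would be moved to the previous month's start; the sample data contains no such date:
df.date = pd.to_datetime(df.date, format='%Y.%m.%d')
# Select the value columns so the original date column is not included in the sum
df.groupby(df.date + pd.offsets.MonthBegin(-1))[['value_1', 'value_2']].sum()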
Out[171]:
value_1 value_2
date
2018-07-01 50 3
2018-08-01 90 6
2018-09-01 80 3
2018-10-01 40 4
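If the data may contain dates that fall exactly on the 1st, one possible alternative (not part of the answer above) is to go through monthly periods instead of an offset. The sketch below uses hypothetical data with a first-of-month date to show the difference; the names shifted, month_start and monthly are illustrative.

```python
import pandas as pd

# Hypothetical data that includes a date on the 1st of a month.
df = pd.DataFrame({
    "date": pd.to_datetime(["2018-08-01", "2018-08-09", "2018-09-10"]),
    "value_1": [30, 40, 30],
})

# With the offset approach, 2018-08-01 is pushed back into July.
shifted = df["date"] + pd.offsets.MonthBegin(-1)
print(shifted)

# to_period('M') maps every date to its calendar month; to_timestamp()
# converts that period back to a timestamp at the start of the month.
month_start = df["date"].dt.to_period("M").dt.to_timestamp()
monthly = df.groupby(month_start)[["value_1"]].sum()
print(monthly)
```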
If you have date as the index, it's as simple as resampling.
df.resample('MS').sum()
If you don't have it as the index already, you can set_index first:
df.set_index('date').resample('MS').sum()
Both give you:
value_1 value_2
date
2018-07-01 50 3
2018-08-01 90 6
2018-09-01 80 3
2018-10-01 40 4
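One behaviour of resample worth knowing: a month with no rows still appears in the result, with a sum of 0. The following sketch uses hypothetical data with a gap in August to illustrate this, along with one possible way to drop such empty months; the names counts and non_empty are illustrative.

```python
import pandas as pd

# Hypothetical data with no rows in August.
df = pd.DataFrame({
    "date": pd.to_datetime(["2018-07-06", "2018-07-14", "2018-09-10"]),
    "value_1": [10, 20, 30],
})

# resample creates a bin for every month in the range, including empty ones.
monthly = df.set_index("date").resample("MS").sum()
print(monthly)

# Optionally keep only the months that actually had rows.
counts = df.set_index("date").resample("MS").size()
non_empty = monthly[counts > 0]
print(non_empty)
```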
Use the dt accessor to get the months from the date column:
df = pd.read_csv(r'C:\Users\Tim\Desktop\data.txt')
df['date'] = pd.to_datetime(df['date'])
df.groupby(df['date'].dt.month)[['value_1', 'value_2']].sum()
This will create the following output (note that the same month from different years would be merged into one group):
value_1 value_2
date
7 50 3
8 90 6
9 80 3
10 40 4
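Because dt.month returns only the month number, data spanning several years collapses into at most twelve groups. If that is not wanted, one option is to group on year and month together. This is a minimal sketch with hypothetical multi-year data; the names year, month, by_month and by_year_month are illustrative.

```python
import pandas as pd

# Hypothetical data covering July in two different years.
df = pd.DataFrame({
    "date": pd.to_datetime(["2017-07-06", "2018-07-14", "2018-08-09"]),
    "value_1": [10, 20, 30],
})

# Grouping on the month number alone merges July 2017 with July 2018.
by_month = df.groupby(df["date"].dt.month)[["value_1"]].sum()
print(by_month)

# Grouping on year and month keeps them apart.
year = df["date"].dt.year.rename("year")
month = df["date"].dt.month.rename("month")
by_year_month = df.groupby([year, month])[["value_1"]].sum()
print(by_year_month)
```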