How do I use timestamp dates as an index in a pivot table in Python?
I have recently begun working with pandas' pivot_table and have run into trouble using timestamps properly in pivot tables.
I have a large dataframe with data like the following:
Date ID Days Quantity Concern
0 2012-06-29 NaN 621 NaN A
1 2012-06-29 1208985 874 1 A
2 2012-06-29 NaN 621 2 B
3 2012-06-29 NaN 874 1 C
4 2012-06-29 NaN 566 NaN A
5 2012-06-29 251254 780 NaN A
6 2012-06-29 NaN 566 NaN C
7 2012-06-29 385379 566 1 B
8 2012-06-29 967911 780 1 B
9 2012-06-29 NaN 521 NaN A
10 2012-06-29 1208985 834 1 C
11 2012-06-29 385379 374 NaN A
12 2012-06-29 967909 780 1 B
13 2012-07-18 NaN 821 NaN A
14 2012-07-18 251254 821 NaN A
15 2012-08-04 756444 676 1 C
16 2012-08-04 756444 676 2 C
17 2012-08-04 NaN 676 NaN A
18 2012-08-24 NaN 571 NaN B
19 2012-08-24 251254 446 1 B
A line like the following works great:
pd.pivot_table(data,index=['Concern'],columns=['ID'],values=['Quantity'],aggfunc='sum')
Currently, when I use the Date column as index=['Date'], it groups by day. I would like the option of grouping by month or year instead. Is there a way to do this with pivot tables when the Date column contains Timestamp objects?
You can access information like the year and month through the .dt accessor that datetime Series have, so you can easily make new columns like:
df['Month'] = df['Date'].dt.month
Then use those columns to create the pivot table:
pd.pivot_table(df, index=['Month'], columns=['ID'],
values=['Quantity'],aggfunc='sum')
Output:
Out[16]:
Quantity
ID 251254 385379 756444 967909 967911 1208985
Month
6 NaN 1 NaN 1 1 2
7 NaN NaN NaN NaN NaN NaN
8 1 NaN 3 NaN NaN NaN
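Note that grouping on the month number alone merges the same month from different years. The question also asked about grouping by year, so here is a minimal sketch of both variants, using .dt.year and .dt.to_period('M'). The small frame and the 'Year' and 'YearMonth' column names are made up for illustration and are not the asker's data. It also assumes the Date column is already a datetime dtype, which is why the sketch converts it with pd.to_datetime first.

```python
import pandas as pd

# Illustrative data only (hypothetical values, not the frame from the question)
df = pd.DataFrame({
    "Date": ["2012-06-29", "2012-06-29", "2012-08-04", "2012-08-04", "2013-06-15"],
    "ID": [1208985, 385379, 756444, 756444, 1208985],
    "Quantity": [1, 1, 1, 2, 5],
})

# The .dt accessor only works on datetime columns, so convert strings first
df["Date"] = pd.to_datetime(df["Date"])

# Derive grouping keys from the timestamps
df["Year"] = df["Date"].dt.year                  # e.g. 2012
df["YearMonth"] = df["Date"].dt.to_period("M")   # e.g. Period('2012-06', 'M')

# Pivot by year
by_year = pd.pivot_table(df, index="Year", columns="ID",
                         values="Quantity", aggfunc="sum")

# Pivot by year and month together, so June 2012 and June 2013 stay separate
by_year_month = pd.pivot_table(df, index="YearMonth", columns="ID",
                               values="Quantity", aggfunc="sum")

print(by_year)
print(by_year_month)
```

Using a period column keeps the index sortable in calendar order, whereas a plain month number would fold June 2012 and June 2013 into one row.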