
How do I use timestamp dates as an index in a pivot table in Python?

I have recently begun working with pandas' pivot_table and have run into a problem using timestamps properly in pivot tables.

I have a large DataFrame with data like the following:

         Date          ID           Days  Quantity      Concern
0  2012-06-29         NaN            621       NaN            A
1  2012-06-29     1208985            874         1            A
2  2012-06-29         NaN            621         2            B
3  2012-06-29         NaN            874         1            C
4  2012-06-29         NaN            566       NaN            A
5  2012-06-29      251254            780       NaN            A
6  2012-06-29         NaN            566       NaN            C
7  2012-06-29      385379            566         1            B
8  2012-06-29      967911            780         1            B
9  2012-06-29         NaN            521       NaN            A
10 2012-06-29     1208985            834         1            C
11 2012-06-29      385379            374       NaN            A
12 2012-06-29      967909            780         1            B
13 2012-07-18         NaN            821       NaN            A
14 2012-07-18      251254            821       NaN            A
15 2012-08-04      756444            676         1            C
16 2012-08-04      756444            676         2            C
17 2012-08-04         NaN            676       NaN            A
18 2012-08-24         NaN            571       NaN            B
19 2012-08-24      251254            446         1            B

A line like the following works great:

pd.pivot_table(data,index=['Concern'],columns=['ID'],values=['Quantity'],aggfunc='sum')

Currently, when I use the Date column as index=['Date'], it groups by day. I would like the option of grouping by month or year instead. Is there a way to do this with pivot tables when the values in the Date column are Timestamp objects?

You can access information such as the year and month through the .dt accessor that datetime Series have, so you can easily create new columns such as:

df['Month'] = df['Date'].dt.month
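Note that .dt.month alone gives the same value for June 2012 and June 2013. If the data spans several years, a year column or a combined year-month column may be more useful. The following is a minimal sketch on a small made-up frame (the sample rows and the 'Year' and 'YearMonth' column names are illustrative, not from the question), assuming the Date column already has a datetime dtype:

```python
import pandas as pd

# Hypothetical sample data; pd.to_datetime ensures a datetime64 dtype.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2012-06-29", "2012-07-18", "2012-08-04", "2013-06-29"]),
    "Quantity": [1, 2, 3, 4],
})

df["Year"] = df["Date"].dt.year                  # calendar year as an integer
df["Month"] = df["Date"].dt.month                # month number 1-12, year is lost
df["YearMonth"] = df["Date"].dt.to_period("M")   # monthly Period, keeps the year
```

Any of these columns can then be passed to index= in pd.pivot_table.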

Then use those columns to create the pivot table:

pd.pivot_table(df, index=['Month'], columns=['ID'],
               values=['Quantity'],aggfunc='sum')

Output:

Out[16]: 
      Quantity                                        
ID     251254  385379  756444  967909  967911  1208985
Month                                                 
6          NaN       1     NaN       1       1       2
7          NaN     NaN     NaN     NaN     NaN     NaN
8            1     NaN       3     NaN     NaN     NaN
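An alternative that avoids the helper column is to pass a pd.Grouper as the index, which bins the timestamps by a frequency. This is a sketch under assumptions rather than part of the original answer: the sample rows are made up, and the frequency aliases 'MS' (month start) and 'YS' (year start) are chosen here so that each row is labelled with the first day of its month or year.

```python
import pandas as pd

# Hypothetical sample data with a datetime64 Date column.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2012-06-29", "2012-06-29", "2012-08-04", "2012-08-04", "2013-06-29"]
    ),
    "ID": [1, 2, 1, 1, 2],
    "Quantity": [1, 1, 1, 2, 5],
})

# Group by calendar month; rows are labelled with the month's first day.
monthly = pd.pivot_table(
    df,
    index=pd.Grouper(key="Date", freq="MS"),
    columns="ID",
    values="Quantity",
    aggfunc="sum",
)

# Group by calendar year; rows are labelled with January 1 of each year.
yearly = pd.pivot_table(
    df,
    index=pd.Grouper(key="Date", freq="YS"),
    columns="ID",
    values="Quantity",
    aggfunc="sum",
)
```

Unlike the .dt.month column, this keeps months from different years apart, since each group is a specific month of a specific year.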


 