
How do I use timestamp dates as an index in a pivot table in python?

I have recently begun working with pandas' pivot_table and have run into a problem using timestamps properly with pivot tables.

I have a large DataFrame with data like the following:

         Date          ID           Days  Quantity      Concern
0  2012-06-29         NaN            621       NaN            A
1  2012-06-29     1208985            874         1            A
2  2012-06-29         NaN            621         2            B
3  2012-06-29         NaN            874         1            C
4  2012-06-29         NaN            566       NaN            A
5  2012-06-29      251254            780       NaN            A
6  2012-06-29         NaN            566       NaN            C
7  2012-06-29      385379            566         1            B
8  2012-06-29      967911            780         1            B
9  2012-06-29         NaN            521       NaN            A
10 2012-06-29     1208985            834         1            C
11 2012-06-29      385379            374       NaN            A
12 2012-06-29      967909            780         1            B
13 2012-07-18         NaN            821       NaN            A
14 2012-07-18      251254            821       NaN            A
15 2012-08-04      756444            676         1            C
16 2012-08-04      756444            676         2            C
17 2012-08-04         NaN            676       NaN            A
18 2012-08-24         NaN            571       NaN            B
19 2012-08-24      251254            446         1            B

A line like the below works great:

pd.pivot_table(data,index=['Concern'],columns=['ID'],values=['Quantity'],aggfunc='sum')

Currently, when I use the Date column as index=['Date'], it groups by day. I would like the option of grouping by month or year instead. Is there a way to do this with pivot tables when the values in the Date column are Timestamp objects?

You can access components such as the year and month through the .dt accessor that datetime Series have, so you can easily make new columns like:

df['Month'] = df['Date'].dt.month

Then use those columns to create the pivot table:

pd.pivot_table(df, index=['Month'], columns=['ID'],
               values=['Quantity'],aggfunc='sum')

Output:

Out[16]: 
      Quantity                                        
ID     251254  385379  756444  967909  967911  1208985
Month                                                 
6          NaN       1     NaN       1       1       2
7          NaN     NaN     NaN     NaN     NaN     NaN
8            1     NaN       3     NaN     NaN     NaN
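The same idea extends to grouping by year, or by year and month together. The following is a minimal sketch, not the answerer's code: the sample data is a small hypothetical subset loosely modelled on the question, and the column names Year, Month, and YearMonth are chosen here only for illustration. It assumes the Date column can be parsed by pd.to_datetime.

```python
import pandas as pd

# Hypothetical sample data, loosely modelled on the question's DataFrame
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2012-06-29", "2012-06-29", "2012-07-18",
        "2012-08-04", "2012-08-04", "2012-08-24",
    ]),
    "Concern": ["A", "B", "A", "C", "C", "B"],
    "Quantity": [1, 2, 3, 1, 2, 1],
})

# Derive grouping columns from the datetime column via the .dt accessor
df["Year"] = df["Date"].dt.year
df["Month"] = df["Date"].dt.month
# A monthly Period keeps year and month together in a single value
df["YearMonth"] = df["Date"].dt.to_period("M")

# Group by year and month using a two-level index
by_year_month = pd.pivot_table(df, index=["Year", "Month"], columns=["Concern"],
                               values="Quantity", aggfunc="sum")

# Group by the combined monthly period instead
by_period = pd.pivot_table(df, index="YearMonth", columns="Concern",
                           values="Quantity", aggfunc="sum")

print(by_year_month)
print(by_period)
```

Using both Year and Month (or the combined period) avoids merging, for example, June 2012 with June 2013 when the data spans several years, which grouping on the month number alone would do.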

