
pivot table in pandas with multiple columns

I have the following dataframe in pandas:

  date        prod    hourly_bucket      tank      trans      flag     
  01-01-2019  TP      05:00:00-06:00:00  2         Preset     Peak
  01-01-2019  TP      05:00:00-06:00:00  2         Preset     Peak
  01-01-2019  TP      05:00:00-06:00:00  2         Non Preset Peak
  02-01-2019  TP      05:00:00-06:00:00  2         Preset     Lean
  02-01-2019  TP      05:00:00-06:00:00  2         Preset     Lean
  02-01-2019  TP      05:00:00-06:00:00  2         Non Preset Lean

My desired dataframe would aggregate at the day and tank level, then count how many Preset and Non Preset transactions fall in Lean and Peak hours:

  date       tank   Lean_Non_Preset  Lean_Preset  Peak_Non_Preset  Peak_Preset
  01-01-2019 2      1                2            1                2

I am doing the following in pandas:

 lean_peak_preset_cnt = df.pivot_table(index=['date','tank'], columns=['flag'],values=['trans'],aggfunc='count').reset_index()  

But it does not give me the required result.

Add 'trans' to the columns parameter, then flatten the MultiIndex in the columns with map and join:

lean_peak_preset_cnt = df.pivot_table(index=['date','tank'], 
                                      columns=['flag','trans'],
                                      aggfunc='size', 
                                      fill_value=0) 

lean_peak_preset_cnt.columns = lean_peak_preset_cnt.columns.map('_'.join)
lean_peak_preset_cnt = lean_peak_preset_cnt.reset_index() 
print (lean_peak_preset_cnt)

         date  tank  Lean_Non Preset  Lean_Preset  Peak_Non Preset  Peak_Preset
0  01-01-2019     2                0            0                1            2
1  02-01-2019     2                1            2                0            0
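
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same pivot. The variable name counts and the extra step that replaces spaces with underscores (to get names like Lean_Non_Preset, as in the desired output) are my own additions, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["01-01-2019"] * 3 + ["02-01-2019"] * 3,
    "prod": ["TP"] * 6,
    "hourly_bucket": ["05:00:00-06:00:00"] * 6,
    "tank": [2] * 6,
    "trans": ["Preset", "Preset", "Non Preset"] * 2,
    "flag": ["Peak"] * 3 + ["Lean"] * 3,
})

# Count rows for every (flag, trans) combination per date and tank.
counts = df.pivot_table(index=["date", "tank"],
                        columns=["flag", "trans"],
                        aggfunc="size",
                        fill_value=0)

# Join the two column levels into one name and replace spaces
# (assumption: underscore-only names are wanted, as in the desired output).
counts.columns = ["_".join(col).replace(" ", "_") for col in counts.columns]
counts = counts.reset_index()
print(counts)
```

Each date ends up with zeros in either the Lean or the Peak columns because, in the sample data, every row of a given date carries the same flag.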

You were almost there:

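piv = (df.pivot_table(index=['date', 'tank'], columns=['trans', 'flag'], 
                      aggfunc='size', fill_value=0))
piv.columns = piv.columns.to_flat_index()

The size function gives the counts you want. fill_value=0 fills the combinations that never occur with 0, and index and columns specify how the rows and columns are grouped. See the docs for more details. to_flat_index collapses the MultiIndex columns into a single level of tuples. On older pandas versions piv.columns.ravel() does the same thing; since pandas 2.0 it returns the index unchanged, so it no longer flattens the columns.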

#                 (Non Preset, Lean)  (Non Preset, Peak)  (Preset, Lean)  (Preset, Peak)
#date       tank
#01-01-2019 2                      0                   1               0               2
#02-01-2019 2                      1                   0               2               0
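
If tuple column labels are awkward to work with, they can be turned into plain strings in one more step. The following is only a sketch under a few assumptions: it uses the sample data from the question, it puts the flag first in each name to match the desired output, and flatten_columns is a made-up helper name, not a pandas function.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["01-01-2019"] * 3 + ["02-01-2019"] * 3,
    "prod": ["TP"] * 6,
    "hourly_bucket": ["05:00:00-06:00:00"] * 6,
    "tank": [2] * 6,
    "trans": ["Preset", "Preset", "Non Preset"] * 2,
    "flag": ["Peak"] * 3 + ["Lean"] * 3,
})

piv = df.pivot_table(index=["date", "tank"], columns=["trans", "flag"],
                     aggfunc="size", fill_value=0)


def flatten_columns(columns, sep="_"):
    """Hypothetical helper: turn (trans, flag) tuples into 'flag_trans' strings."""
    return [sep.join((flag, trans.replace(" ", sep))) for trans, flag in columns]


piv.columns = flatten_columns(piv.columns)
piv = piv.reset_index()
print(piv)
```

The columns keep the order produced by the pivot (sorted by trans, then flag), so reorder them explicitly if a particular layout is needed.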
