
Formatting month column labels in a Pandas pivot table

I am building a pivot table with Pandas from JSON data. I'd like to format the column names before passing the table to to_string().

import numpy as np
import pandas as pd

json_data = [
 {"year":2019,"month":"2019-11-01","sub_category":"Van fit out","category":"Vehicle","notes":"Heavy duty hooks","gross_amount":8.96}, 
 {"year":2019,"month":"2019-11-01","sub_category":"Fuel & oil","category":"Vehicle","notes":"Fuel","gross_amount":20.00},  
# more data  [...]
 {"year":2020,"month":"2020-02-01","sub_category":"Gutter Vac","category":"WC Equipment + H&S","notes":"Tape Measure + Bungi Cord + Plastic Membrane + Extension reel + Microfibre cloths + Waterproof Jacket","gross_amount":97.94}, 
 {"year":2020,"month":"2020-02-01","sub_category":"Trad equipment","category":"WC Materials","notes":"Spray Bottle + Microfibres","gross_amount":4.47}, 
 ]            

data = pd.DataFrame(json_data)

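# Pivot the data:
pivot = pd.pivot_table(
    data,
    values=['gross_amount'],
    index=['category', 'sub_category'],
    columns=['year', 'month'],
    aggfunc=np.sum,
    fill_value=0,
    dropna=True,
    margins=True)

# Add a 'Total' row for each index level 0 group.
# DataFrame.append() was removed in pandas 2.0, so each total row is attached with pd.concat():
def with_total(k, d):
    total = d.sum(skipna=True).to_frame().T
    total.index = pd.MultiIndex.from_tuples([(k, 'Total')], names=d.index.names)
    return pd.concat([d, total])

pivot = pd.concat([with_total(k, d) for k, d in pivot.groupby(level=0)])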

# Render to string:
string = pivot.to_string()

print(string)

The result is

                                  gross_amount
year                                      2019       2020     All
month                               2019-11-01 2020-02-01
category           sub_category
All                                      28.96     102.41  131.37
                   Total                 28.96     102.41  131.37
Vehicle            Fuel & oil            20.00       0.00   20.00
                   Van fit out            8.96       0.00    8.96
                   Total                 28.96       0.00   28.96
WC Equipment + H&S Gutter Vac             0.00      97.94   97.94
                   Total                  0.00      97.94   97.94
WC Materials       Trad equipment         0.00       4.47    4.47
                   Total                  0.00       4.47    4.47

How can I format the months differently (in my case, I need the month name)? I tried converting the month to a string in the DataFrame before pivoting, but then the columns lose their chronological order.

Thanks

This seems to work:

# first get the column as date
data["monthdate"] = pd.to_datetime(data["month"])
# then format a column with the name
data['monthname'] = data["monthdate"].dt.strftime("%B")

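# Pivot the data:
pivot = pd.pivot_table(
    data,
    values=['gross_amount'],
    index=['category', 'sub_category'],
    columns=['year', 'monthname'],
    aggfunc=np.sum,
    fill_value=0,
    dropna=True,
    margins=True)

# Add a 'Total' row for each index level 0 group.
# DataFrame.append() was removed in pandas 2.0, so each total row is attached with pd.concat():
def with_total(k, d):
    total = d.sum(skipna=True).to_frame().T
    total.index = pd.MultiIndex.from_tuples([(k, 'Total')], names=d.index.names)
    return pd.concat([d, total])

pivot = pd.concat([with_total(k, d) for k, d in pivot.groupby(level=0)])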

# Render to string:
string = pivot.to_string()

print(string)
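
One possible variation, which is not part of the answer above, is to pivot on the real datetime column and relabel that column level only afterwards. The columns are then ordered chronologically before the dates are turned into names. The sketch below uses a small made-up dataset with a single index level and no margins, and it uses Timestamp.month_name() (English names) in place of strftime("%B"); treat it as an illustration under those assumptions rather than a drop-in replacement.

```python
import pandas as pd

# Small illustrative dataset (values are made up for this sketch).
data = pd.DataFrame({
    "year": [2019, 2019, 2020, 2020],
    "month": ["2019-11-01", "2019-12-01", "2020-01-01", "2020-02-01"],
    "category": ["Vehicle", "Vehicle", "Vehicle", "WC Materials"],
    "gross_amount": [20.00, 8.96, 97.94, 4.47],
})

# Pivot on real dates so the columns are sorted chronologically.
data["monthdate"] = pd.to_datetime(data["month"])
pivot = pd.pivot_table(
    data,
    values="gross_amount",
    index="category",
    columns=["year", "monthdate"],
    aggfunc="sum",
    fill_value=0,
)

# Relabel only the date level; the existing column order is kept.
pivot = pivot.rename(columns=lambda ts: ts.month_name(), level="monthdate")

print(pivot.to_string())
```

Because the rename happens after the pivot, alphabetical sorting of the month names never comes into play.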

It doesn't look like this is possible with pivot_table(), but I managed it with groupby() instead, which accepts a sort argument:

pivot = data.groupby(['category', 'sub_category', 'year', 'monthname'], sort=False)['gross_amount'].sum().unstack(['year', 'monthname'])

Passing sort=False stops groupby() from sorting the keys alphabetically and preserves the order in which the months first appear, so the DataFrame has to be sorted chronologically before grouping.
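
A minimal sketch of that approach, using a made-up dataset whose rows are deliberately out of order. The monthdate helper column, the sort_values() call and the final fillna(0) are assumptions added for illustration; only the groupby(..., sort=False) and unstack() calls come from the line above.

```python
import pandas as pd

# Illustrative data, intentionally not in chronological order.
data = pd.DataFrame({
    "year": [2020, 2019, 2020, 2019],
    "month": ["2020-02-01", "2019-12-01", "2020-01-01", "2019-11-01"],
    "category": ["WC Materials", "Vehicle", "Vehicle", "Vehicle"],
    "gross_amount": [4.47, 8.96, 97.94, 20.00],
})

data["monthdate"] = pd.to_datetime(data["month"])
data["monthname"] = data["monthdate"].dt.month_name()

# Sort chronologically first: with sort=False the groups keep the
# order in which they first appear in the data.
data = data.sort_values("monthdate")

pivot = (
    data.groupby(["category", "year", "monthname"], sort=False)["gross_amount"]
    .sum()
    .unstack(["year", "monthname"])
    .fillna(0)
)

print(pivot.to_string())
```

With the rows sorted by date beforehand, the month columns come out in calendar order within each year instead of alphabetically.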
