I am building a pivot table with Pandas from JSON data. I'd like to format the column names before passing the result to to_string().
import numpy as np
import pandas as pd
json_data = [
{"year":2019,"month":"2019-11-01","sub_category":"Van fit out","category":"Vehicle","notes":"Heavy duty hooks","gross_amount":8.96},
{"year":2019,"month":"2019-11-01","sub_category":"Fuel & oil","category":"Vehicle","notes":"Fuel","gross_amount":20.00},
# more data [...]
{"year":2020,"month":"2020-02-01","sub_category":"Gutter Vac","category":"WC Equipment + H&S","notes":"Tape Measure + Bungi Cord + Plastic Membrane + Extension reel + Microfibre cloths + Waterproof Jacket","gross_amount":97.94},
{"year":2020,"month":"2020-02-01","sub_category":"Trad equipment","category":"WC Materials","notes":"Spray Bottle + Microfibres","gross_amount":4.47},
]
data = pd.DataFrame(json_data)
# Pivot the data:
pivot = pd.pivot_table(
data, values=['gross_amount'], index=['category', 'sub_category'],
columns=['year', 'month'], aggfunc=np.sum, fill_value=0, dropna=True, margins=True)
# Add total rows for index level 0
# (DataFrame.append was removed in pandas 2.0, so use pd.concat instead):
pivot = pd.concat([
pd.concat([d, d.sum(skipna=True).rename((k, 'Total')).to_frame().T])
for k, d in pivot.groupby(level=0)
])
# Render to string:
string = pivot.to_string()
print(string)
The result is:
                                  gross_amount
year                                      2019       2020     All
month                               2019-11-01 2020-02-01
category           sub_category
All                                      28.96     102.41  131.37
                   Total                 28.96     102.41  131.37
Vehicle            Fuel & oil            20.00       0.00   20.00
                   Van fit out            8.96       0.00    8.96
                   Total                 28.96       0.00   28.96
WC Equipment + H&S Gutter Vac             0.00      97.94   97.94
                   Total                  0.00      97.94   97.94
WC Materials       Trad equipment         0.00       4.47    4.47
                   Total                  0.00       4.47    4.47
How can I get the months to be formatted differently (in my case I need the month name)? I have changed the month to a string in the dataframe before pivoting but then I lose the correct order.
Thanks
This seems to work:
# first get the column as date
data["monthdate"] = pd.to_datetime(data["month"])
# then format a column with the name
data['monthname'] = data["monthdate"].dt.strftime("%B")
# Pivot the data:
pivot = pd.pivot_table(
data,
values=['gross_amount'],
index=['category', 'sub_category'],
columns=['year', 'monthname'],
aggfunc=np.sum,
fill_value=0,
dropna=True,
margins=True)
# Add total rows for index level 0
# (DataFrame.append was removed in pandas 2.0, so use pd.concat instead):
pivot = pd.concat([
pd.concat([d, d.sum(skipna=True).rename((k, 'Total')).to_frame().T])
for k, d in pivot.groupby(level=0)
])
# Render to string:
string = pivot.to_string()
print(string)
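As a small illustration of the two conversion steps above, here is a minimal sketch that runs them on a standalone Series of month strings. The variable names are made up for the example, and the exact month names depend on the locale (an English or C locale is assumed here).

```python
import pandas as pd

# Two month strings in the same format as the "month" field in the question.
months = pd.Series(["2019-11-01", "2020-02-01"])

# Step 1: parse the strings into real datetimes.
as_dates = pd.to_datetime(months)

# Step 2: format the datetimes; "%B" is the full month name (locale-dependent),
# "%b %Y" is the abbreviated month name followed by the year.
names = as_dates.dt.strftime("%B")
short = as_dates.dt.strftime("%b %Y")

print(names.tolist())
print(short.tolist())
```

Note that the formatted values are plain strings, so anything that sorts them will sort alphabetically rather than chronologically.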
It doesn't look like this is possible with pivot_table(), but I managed it with groupby() instead, which accepts a sort argument:
pivot = data.groupby(['category', 'sub_category', 'year', 'monthname'], sort=False)['gross_amount'].sum().unstack(['year', 'monthname'])
Passing sort=False stops it from sorting the group keys alphabetically and preserves the order in which the months first appear, so the dataframe has to be sorted chronologically before grouping.
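The following is a minimal, self-contained sketch of that approach. The rows are made up for illustration (including a December entry that is not in the question's data, so that alphabetical and chronological order differ within one year), and the helper columns monthdate and monthname follow the naming used in the earlier answer.

```python
import pandas as pd

# Made-up rows for illustration only; not the question's full data set.
rows = [
    {"year": 2020, "month": "2020-02-01", "category": "WC Materials",
     "sub_category": "Trad equipment", "gross_amount": 4.47},
    {"year": 2019, "month": "2019-12-01", "category": "Vehicle",
     "sub_category": "Fuel & oil", "gross_amount": 15.00},
    {"year": 2019, "month": "2019-11-01", "category": "Vehicle",
     "sub_category": "Fuel & oil", "gross_amount": 20.00},
    {"year": 2019, "month": "2019-11-01", "category": "Vehicle",
     "sub_category": "Van fit out", "gross_amount": 8.96},
]
data = pd.DataFrame(rows)
data["monthdate"] = pd.to_datetime(data["month"])
data["monthname"] = data["monthdate"].dt.strftime("%B")

# Sort chronologically first: with sort=False the groups keep the order
# in which they first appear in the dataframe.
data = data.sort_values("monthdate")

pivot = (
    data.groupby(["category", "sub_category", "year", "monthname"], sort=False)["gross_amount"]
    .sum()
    .unstack(["year", "monthname"])
    .fillna(0)  # combinations with no rows come back as NaN after unstack
)

print(pivot.to_string())
```

Unlike pivot_table(), this does not add margins, so any "All" or "Total" rows would still need to be appended separately.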