Format time in column headers for a Pandas pivot table
I am building a pivot table with Pandas from JSON data. I'd like to format the column names before sending the table to to_string().
import numpy as np
import pandas as pd
json_data = [
    {"year":2019,"month":"2019-11-01","sub_category":"Van fit out","category":"Vehicle","notes":"Heavy duty hooks","gross_amount":8.96},
    {"year":2019,"month":"2019-11-01","sub_category":"Fuel & oil","category":"Vehicle","notes":"Fuel","gross_amount":20.00},
    # more data [...]
    {"year":2020,"month":"2020-02-01","sub_category":"Gutter Vac","category":"WC Equipment + H&S","notes":"Tape Measure + Bungi Cord + Plastic Membrane + Extension reel + Microfibre cloths + Waterproof Jacket","gross_amount":97.94},
    {"year":2020,"month":"2020-02-01","sub_category":"Trad equipment","category":"WC Materials","notes":"Spray Bottle + Microfibres","gross_amount":4.47},
]
data = pd.DataFrame(json_data)
# Pivot the data:
pivot = pd.pivot_table(
    data, values=['gross_amount'], index=['category', 'sub_category'],
    columns=['year', 'month'], aggfunc=np.sum, fill_value=0, dropna=True, margins=True)
# Add total rows for index level 0
# (DataFrame.append was removed in pandas 2.0, so this needs pandas < 2.0):
pivot = pd.concat([
    d.append(d.sum(skipna=True).rename((k, 'Total')))
    for k, d in pivot.groupby(level=0)
])
# Render to string:
string = pivot.to_string()
print(string)
The result is:
                                  gross_amount
year                                      2019       2020     All
month                               2019-11-01 2020-02-01
category           sub_category
All                                      28.96     102.41  131.37
                   Total                 28.96     102.41  131.37
Vehicle            Fuel & oil            20.00       0.00   20.00
                   Van fit out            8.96       0.00    8.96
                   Total                 28.96       0.00   28.96
WC Equipment + H&S Gutter Vac             0.00      97.94   97.94
                   Total                  0.00      97.94   97.94
WC Materials       Trad equipment         0.00       4.47    4.47
                   Total                  0.00       4.47    4.47
How can I get the months to be formatted differently (in my case I need the month name)? I changed the month to a string in the dataframe before pivoting, but then I lose the correct order.
Thanks
This seems to work:
# first get the column as date
data["monthdate"] = pd.to_datetime(data["month"])
# then format a column with the name
data['monthname'] = data["monthdate"].dt.strftime("%B")
# Pivot the data:
pivot = pd.pivot_table(
    data,
    values=['gross_amount'],
    index=['category', 'sub_category'],
    columns=['year', 'monthname'],
    aggfunc=np.sum,
    fill_value=0,
    dropna=True,
    margins=True)
# Add total rows for index level 0
# (DataFrame.append was removed in pandas 2.0, so this needs pandas < 2.0):
pivot = pd.concat([
    d.append(d.sum(skipna=True).rename((k, 'Total')))
    for k, d in pivot.groupby(level=0)
])
# Render to string:
string = pivot.to_string()
print(string)
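The snippet above pivots on plain month-name strings, which pivot_table() sorts alphabetically within each year. One possible tweak, not part of the original answer, is to store the month names as an ordered Categorical so that sorting follows calendar order. This is only a sketch: the sample rows, the month_order list, and the use of dt.month_name() and observed=True are my own assumptions, and margins and the Total rows are left out for brevity.

```python
import pandas as pd

# Hypothetical sample rows, deliberately not in chronological order.
data = pd.DataFrame({
    "month": ["2020-02-01", "2019-12-01", "2020-01-01", "2019-11-01"],
    "category": ["Vehicle", "Vehicle", "WC Materials", "Vehicle"],
    "gross_amount": [97.94, 20.00, 4.47, 8.96],
})

dates = pd.to_datetime(data["month"])
data["year"] = dates.dt.year

# Calendar-ordered month names, January through December.
month_order = list(pd.date_range("2000-01-01", periods=12, freq="MS").month_name())

# An ordered Categorical sorts by category position, not alphabetically.
data["monthname"] = pd.Categorical(
    dates.dt.month_name(), categories=month_order, ordered=True
)

pivot = pd.pivot_table(
    data,
    values="gross_amount",
    index="category",
    columns=["year", "monthname"],
    aggfunc="sum",
    fill_value=0,
    observed=True,  # keep only months that actually occur in the data
)

column_labels = list(pivot.columns)
print(pivot.to_string())
```

With plain strings the 2019 columns would come out as December, November; with the ordered Categorical they should follow the calendar.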
It doesn't look like this is possible with pivot_table(), but I managed it with groupby() instead, which accepts a sort argument:
pivot = data.groupby(['category', 'sub_category', 'year', 'monthname'], sort=False)['gross_amount'].sum().unstack(['year', 'monthname'])
The sort=False stops it from sorting alphabetically and preserves the order in which the months first appeared, so the dataframe has to be sorted prior to grouping.
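Another option that may be worth trying, since the question asks about formatting the column names just before to_string(): pivot on the sortable ISO date strings and relabel the month level afterwards with DataFrame.rename(). This is a sketch under my own assumptions, not something from the answers above; the sample rows and the month_label helper are hypothetical, and margins are omitted because the 'All' label would not parse as a date.

```python
import pandas as pd

# Hypothetical sample rows in the same shape as the question's data.
data = pd.DataFrame({
    "year": [2019, 2019, 2020, 2020],
    "month": ["2019-11-01", "2019-12-01", "2020-01-01", "2020-02-01"],
    "category": ["Vehicle", "Vehicle", "WC Materials", "Vehicle"],
    "gross_amount": [8.96, 20.00, 4.47, 97.94],
})

# Pivot on the ISO date strings, which sort chronologically.
pivot = pd.pivot_table(
    data,
    values="gross_amount",
    index="category",
    columns=["year", "month"],
    aggfunc="sum",
    fill_value=0,
)


def month_label(value):
    """Turn an ISO date string such as '2019-11-01' into 'November'."""
    return pd.Timestamp(value).month_name()


# Relabel only the 'month' column level; the column order is left untouched.
renamed = pivot.rename(columns=month_label, level="month")

renamed_labels = list(renamed.columns)
print(renamed.to_string())
```

Because the relabelling happens after the pivot, the ordering is decided by the original dates and the month names are purely cosmetic.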