
Format time format in columns for Pandas pivot table

I am building a pivot table with Pandas from JSON data. I'd like to format the column names before passing the table to to_string().

import numpy as np
import pandas as pd

json_data = [
 {"year":2019,"month":"2019-11-01","sub_category":"Van fit out","category":"Vehicle","notes":"Heavy duty hooks","gross_amount":8.96}, 
 {"year":2019,"month":"2019-11-01","sub_category":"Fuel & oil","category":"Vehicle","notes":"Fuel","gross_amount":20.00},  
# more data  [...]
 {"year":2020,"month":"2020-02-01","sub_category":"Gutter Vac","category":"WC Equipment + H&S","notes":"Tape Measure + Bungi Cord + Plastic Membrane + Extension reel + Microfibre cloths + Waterproof Jacket","gross_amount":97.94}, 
 {"year":2020,"month":"2020-02-01","sub_category":"Trad equipment","category":"WC Materials","notes":"Spray Bottle + Microfibres","gross_amount":4.47}, 
 ]            

data = pd.DataFrame(json_data)

# Pivot the data:
pivot = pd.pivot_table(
    data,
    values=['gross_amount'],
    index=['category', 'sub_category'],
    columns=['year', 'month'],
    aggfunc=np.sum,
    fill_value=0,
    dropna=True,
    margins=True)

# Add total rows for index level 0.
# Note: DataFrame.append was removed in pandas 2.0, so this step needs an older pandas.
pivot = pd.concat([
    d.append(d.sum(skipna=True).rename((k, 'Total')))
    for k, d in pivot.groupby(level=0)
])

# Render to string:
string = pivot.to_string()

print(string)

The result is:

                                  gross_amount
year                                      2019       2020     All
month                               2019-11-01 2020-02-01
category           sub_category
All                                      28.96     102.41  131.37
                   Total                 28.96     102.41  131.37
Vehicle            Fuel & oil            20.00       0.00   20.00
                   Van fit out            8.96       0.00    8.96
                   Total                 28.96       0.00   28.96
WC Equipment + H&S Gutter Vac             0.00      97.94   97.94
                   Total                  0.00      97.94   97.94
WC Materials       Trad equipment         0.00       4.47    4.47
                   Total                  0.00       4.47    4.47

How can I get the months to be formatted differently (in my case I need the month name)? I have tried converting the month to a string in the dataframe before pivoting, but then I lose the correct order.

Thanks

This seems to work:

# first get the column as date
data["monthdate"] = pd.to_datetime(data["month"])
# then format a column with the name
data['monthname'] = data["monthdate"].dt.strftime("%B")

# Pivot the data:
pivot = pd.pivot_table(
    data,
    values=['gross_amount'],
    index=['category', 'sub_category'],
    columns=['year', 'monthname'],
    aggfunc=np.sum,
    fill_value=0,
    dropna=True,
    margins=True)

# Add total rows for index level 0.
# Note: DataFrame.append was removed in pandas 2.0, so this step needs an older pandas.
pivot = pd.concat([
    d.append(d.sum(skipna=True).rename((k, 'Total')))
    for k, d in pivot.groupby(level=0)
])

# Render to string:
string = pivot.to_string()

print(string)
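
If the month names need to stay in calendar order within each year, one option is to pivot on the datetime column and only build the month-name labels afterwards. This is a minimal sketch under stated assumptions, not necessarily what the answer above intends: the sample rows are made up, the index is reduced to category, margins are left out, and the level name "month" for the final labels is an arbitrary choice.

```python
import pandas as pd

# Made-up sample rows in the same shape as the question's data,
# deliberately not in chronological order.
data = pd.DataFrame([
    {"year": 2020, "month": "2020-02-01", "category": "Vehicle", "gross_amount": 20.00},
    {"year": 2019, "month": "2019-11-01", "category": "Vehicle", "gross_amount": 8.96},
    {"year": 2020, "month": "2020-01-01", "category": "WC Materials", "gross_amount": 4.47},
])
data["monthdate"] = pd.to_datetime(data["month"])

# Pivot on the datetime column so the columns come out in chronological order.
pivot = pd.pivot_table(
    data,
    values="gross_amount",
    index="category",
    columns=["year", "monthdate"],
    aggfunc="sum",
    fill_value=0,
)

# Only now replace the datetime labels with month names.
# Rebuilding the MultiIndex from arrays keeps the existing column order.
pivot.columns = pd.MultiIndex.from_arrays(
    [
        pivot.columns.get_level_values("year"),
        pivot.columns.get_level_values("monthdate").strftime("%B"),
    ],
    names=["year", "month"],  # "month" is an arbitrary label for this sketch
)

print(pivot.to_string())
```

Because the labels are swapped after the pivot, sorting never sees the month names, so alphabetical ordering does not come into play. Note that %B produces locale-dependent names.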

It doesn't look like this is possible with pivot_table(), but I managed it with groupby() instead, which accepts a sort argument:

pivot = data.groupby(['category', 'sub_category', 'year', 'monthname'], sort=False)['gross_amount'].sum().unstack(['year', 'monthname'])

The sort=False stops it from sorting alphabetically and preserves the order in which the months first appear, so the dataframe has to be sorted chronologically before grouping.
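
The sort-then-group step can be sketched as follows. The sample rows are made up and deliberately out of order, sub_category is left out for brevity, and the monthdate and monthname columns are built as in the earlier answer. Treat this as an illustration of the idea rather than a definitive implementation.

```python
import pandas as pd

# Made-up sample rows, deliberately not in chronological order.
data = pd.DataFrame([
    {"year": 2020, "month": "2020-02-01", "category": "Vehicle", "gross_amount": 20.00},
    {"year": 2019, "month": "2019-11-01", "category": "Vehicle", "gross_amount": 8.96},
    {"year": 2020, "month": "2020-01-01", "category": "WC Materials", "gross_amount": 4.47},
])
data["monthdate"] = pd.to_datetime(data["month"])
data["monthname"] = data["monthdate"].dt.strftime("%B")

# Sort chronologically first: with sort=False the groups keep the order
# in which they first appear in the dataframe.
data = data.sort_values("monthdate")

pivot = (
    data.groupby(["category", "year", "monthname"], sort=False)["gross_amount"]
    .sum()
    .unstack(["year", "monthname"], fill_value=0)
)

print(pivot.to_string())
```

Without the sort_values call, the columns would follow the order of the raw rows instead (February, November, January for this sample).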
