Python pivot_table - Add a difference column

Question

I am new to Python. I have the following data frame, which I am able to pivot in Excel.

I want to add the difference column (in the image, I added it manually).

The difference is the B - A value. I am able to replicate everything except the difference column and the Grand Total using a Python pivot table. Below is my code.

table = pd.pivot_table(data, index=['Category'], values = ['value'], columns=['Name','Date'], fill_value=0)

How can I add the difference column and calculate the value?

How can I get the Grand Total at the bottom?

The data is as below:

df = pd.DataFrame({
"Value": [0.1, 0.2, 3, 1, -.5, 4],
"Date": ["2020-07-01", "2020-07-01", "2020-07-01", "2020-07-01", "2020-07-01", "2020-07-01"],
"Name": ['A', 'A', 'A', 'B', 'B', 'B'],
"HI Display1": ["X", "Y", "Z", "Z", "Y", "X"]})

I want the pivot table to look as below.

Answer 1

Here's a way to do that:

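import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "A", "A", "B", "B", "B"],
    "Date": "2020-07-01",
    "Value": [0.1, 0.2, 3, 2, -.5, 4],
    "Category": ["Z", "Y", "X", "Z", "Y", "X"]
})

# Pass values= explicitly so the string Date column is not aggregated too.
piv = pd.pivot_table(df, index="Category", columns="Name", values="Value", aggfunc="sum")
piv.columns = list(piv.columns)  # plain column labels, without the 'Name' header
piv["diff"] = piv.B - piv.A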

The output (piv) is:

            A    B  diff
Category                
X         3.0  4.0   1.0
Y         0.2 -0.5  -0.7
Z         0.1  2.0   1.9

To add a 'total' row for A and B, do:

piv.loc["total"] = piv.sum()

Remove the total from the 'diff' column:

piv.loc["total", "diff"] = ""  # or np.nan (with `import numpy as np`), if you'd
                               # like to be more 'pandas' style.

The output now is:

            A    B  diff
Category                
X         3.0  4.0   1.0
Y         0.2 -0.5  -0.7
Z         0.1  2.0   1.9
total     3.3  5.5
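
The steps so far can be put together in one self-contained sketch. It assumes a recent pandas version, passes values="Value" explicitly, and uses np.nan rather than an empty string for the blank total so that the diff column stays numeric. The helper name pivot_with_diff is made up for illustration.

```python
import numpy as np
import pandas as pd


def pivot_with_diff(df):
    """Pivot Value by Category x Name, add a B - A column and a total row."""
    piv = df.pivot_table(index="Category", columns="Name",
                         values="Value", aggfunc="sum")
    piv.columns.name = None            # drop the 'Name' label above the columns
    piv["diff"] = piv["B"] - piv["A"]
    piv.loc["total"] = piv.sum()       # column-wise totals as a new row
    piv.loc["total", "diff"] = np.nan  # leave the total of 'diff' blank
    return piv


df = pd.DataFrame({
    "Name": ["A", "A", "A", "B", "B", "B"],
    "Date": "2020-07-01",
    "Value": [0.1, 0.2, 3, 2, -.5, 4],
    "Category": ["Z", "Y", "X", "Z", "Y", "X"],
})
piv = pivot_with_diff(df)
print(piv)
```

With np.nan in place of "", the blank cell prints as NaN, and it can be hidden at display time instead of being stored as text.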

If, at this point, you'd like to add the title 'Name' on top of the columns, do:

piv.columns = pd.MultiIndex.from_product([["Name"], piv.columns])

piv is now:

         Name          
            A    B diff
Category               
X         3.0  4.0  1.0
Y         0.2 -0.5 -0.7
Z         0.1  2.0  1.9
total     3.3  5.5

To add the date to each column:

date = df.Date.max()
piv.columns = pd.MultiIndex.from_tuples([c+(date,) for c in piv.columns])

==>
               Name                      
                  A          B       diff
         2020-07-01 2020-07-01 2020-07-01
Category                                 
X               3.0        4.0          1
Y               0.2       -0.5       -0.7
Z               0.1        2.0        1.9
total           3.3        5.5
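
As an alternative to attaching the date by hand, the date can come from the data itself by pivoting on both Name and Date. The following is only a sketch under that assumption: the column levels are then Name and Date (without the extra 'Name' title row), and the difference is computed per date.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "A", "A", "B", "B", "B"],
    "Date": "2020-07-01",
    "Value": [0.1, 0.2, 3, 2, -.5, 4],
    "Category": ["Z", "Y", "X", "Z", "Y", "X"],
})

piv = df.pivot_table(index="Category", columns=["Name", "Date"],
                     values="Value", aggfunc="sum")

# piv["B"] and piv["A"] are sub-tables with one column per Date,
# so the subtraction lines up date by date.
diff = piv["B"] - piv["A"]

# Give the difference block the same two column levels as the pivot.
diff.columns = pd.MultiIndex.from_product([["diff"], diff.columns],
                                          names=["Name", "Date"])
piv = pd.concat([piv, diff], axis=1)
print(piv)
```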

Finally, to color a column (e.g. if you're using Jupyter), do:

# Assumes the blank total cell holds np.nan rather than "".
diff_col = piv.columns[2]  # the third column, i.e. 'diff'
piv.style.background_gradient("PiYG", subset=[diff_col]).highlight_null("white").format(na_rep="")

Answer 2

Another way to add totals is to pass the margins=True argument to the pivot function and then replace the Totals column with the difference, like this:

import pandas as pd

data = {
    'Name': ['A', 'A', 'A', 'B', 'B', 'B', 'A', 'A', 'A', 'B', 'B', 'B'],
    'Value': [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6],
    'Category': ['X', 'Y', 'Z', 'X', 'Y', 'Z', 'X', 'Y', 'Z', 'X', 'Y', 'Z']
}

df = pd.DataFrame(data)

pivot_ = df.pivot_table(index=["Category"],
                        columns="Name",
                        values="Value",
                        aggfunc="sum",
                        margins=True,
                        margins_name='Totals').fillna('')

# Overwrite the row totals with the difference B - A.
pivot_['Totals'] = pivot_['B'] - pivot_['A']

# rename() returns a new DataFrame; assign it to keep the new column name.
pivot_ = pivot_.rename(columns={"Totals": "Diff"})

Output:

Name    A   B   Diff
Category            
X       2   8   6
Y       4   10  6
Z       6   12  6
Totals  12  30  18
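
If the row totals should be kept as well, a variant is to add the difference as a new column instead of overwriting Totals. This is a minimal sketch of that idea, using the same sample data as this answer; keeping both columns is an assumption about what is wanted, not something the question asks for.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "A", "A", "B", "B", "B"] * 2,
    "Value": [1, 2, 3, 4, 5, 6] * 2,
    "Category": ["X", "Y", "Z"] * 4,
})

pivot_ = df.pivot_table(index="Category", columns="Name", values="Value",
                        aggfunc="sum", margins=True, margins_name="Totals")

# margins=True added a 'Totals' row and a 'Totals' column.
# Add the difference next to them rather than replacing the column.
pivot_["Diff"] = pivot_["B"] - pivot_["A"]
print(pivot_)
```

Because the Totals row is part of the pivot, the bottom cell of Diff is simply the difference of the two column totals.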

EDIT BASED ON QUESTION UPDATE:

Let's use the sample data you now provided (here assumed to be loaded as df_1):

pivot_1 = df_1.pivot_table(index = ["HI Display1"], 
              columns = ["Name", 'Date'], 
              values = "Value", 
              aggfunc = "sum", 
              margins=True, 
              margins_name='Totals'
).fillna('')

pivot_1['Totals'] = pivot_1['B'].sum(axis=1) - pivot_1['A'].sum(axis=1)

# rename() returns a new DataFrame; assign it to keep the new column name.
pivot_1 = pivot_1.rename(columns={"Totals": "Diff"})

Output:

Name        A           B           Diff
Date        2020-07-01  2020-07-01  
HI Display1         
X           0.1         4.0         3.9
Y           0.2         -0.5        -0.7
Z           3.0         1.0         -2.0
Totals      3.3         4.5         1.2
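
With two column levels, pivot_1['B'] is itself a small DataFrame with one column per date, which is why the code above sums it along axis=1 before subtracting. A minimal sketch of just that step, assuming the question's sample data is loaded as df_1:

```python
import pandas as pd

df_1 = pd.DataFrame({
    "Value": [0.1, 0.2, 3, 1, -.5, 4],
    "Date": ["2020-07-01"] * 6,
    "Name": ["A", "A", "A", "B", "B", "B"],
    "HI Display1": ["X", "Y", "Z", "Z", "Y", "X"],
})

pivot_1 = df_1.pivot_table(index="HI Display1", columns=["Name", "Date"],
                           values="Value", aggfunc="sum",
                           margins=True, margins_name="Totals")

b_block = pivot_1["B"]  # DataFrame: one column per Date under Name == 'B'
a_block = pivot_1["A"]  # same for Name == 'A'

# Collapse each block over its dates, then subtract row by row.
diff = b_block.sum(axis=1) - a_block.sum(axis=1)
print(diff)
```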
