How to add a new column to a pivot table using pandas

Question

I'm trying to create a new column that shows the weightage of every product that I have.

Let's say I have the following dataframe, which I have pivoted:

   PRODUCT  UNIT_TESTED AVG_YIELD 
        A       401    82.1042
        B      1512    96.0687  
        C       292    22.7806  
        D       134    37.0088

using

 pd.pivot_table(data = df, index = ['PRODUCT'], 
                  values = ("UNIT_TESTED","AVG_YIELD"), 
                  aggfunc = "sum", margins=True)\
     .fillna('')

Now, I want to add a new column WEIGHTAGE for each product.

The calculation:

WEIGHTAGE 'A' = (UNIT_TESTED 'A' / Total of UNIT_TESTED) * 100

This is where I'm stuck: I don't know how to write the code that creates the new column.

My desired output:

PRODUCT UNIT_TESTED AVG_YIELD WEIGHTAGE
    A       401      82.1042    17.1441
    B      1512      96.0687    64.6430
    C       292      22.7806    12.4840
    D       134      37.0088    5.7289

Answer 1

The last row of the pivot table that you obtained (the "All" row added by margins=True) contains the total of UNIT_TESTED. So you can simply divide the UNIT_TESTED column by that value ( pivot_df.loc["All","UNIT_TESTED"] ):

pivot_df = pd.pivot_table(data = df, index = ['PRODUCT'], 
                  values = ("UNIT_TESTED","AVG_YIELD"), 
                  aggfunc = "sum", margins=True)\
     .fillna('')

pivot_df["Weightage"] = round((pivot_df["UNIT_TESTED"] / pivot_df.loc["All","UNIT_TESTED"])*100,2)

print(pivot_df)

Output:

    AVG_YIELD   UNIT_TESTED Weightage
PRODUCT         
A   82.1042     401        17.14
B   96.0687     1512       64.64
C   22.7806     292        12.48
D   37.0080     134        5.73
All 237.9615    2339       100.00
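For anyone who wants to run this end to end, here is a minimal sketch. The raw `df` is an assumption: it is reconstructed from the table in the question with one row per product, whereas the real data probably has several rows per product. It rounds to four decimals and drops the "All" row so the result matches the desired output in the question.

```python
import pandas as pd

# Hypothetical raw data: one row per product, values copied from the question.
df = pd.DataFrame({
    "PRODUCT": ["A", "B", "C", "D"],
    "UNIT_TESTED": [401, 1512, 292, 134],
    "AVG_YIELD": [82.1042, 96.0687, 22.7806, 37.0088],
})

pivot_df = pd.pivot_table(
    data=df,
    index=["PRODUCT"],
    values=["UNIT_TESTED", "AVG_YIELD"],
    aggfunc="sum",
    margins=True,  # appends the "All" row holding the column totals
)

# Grand total of UNIT_TESTED, read from the margin row.
total = pivot_df.loc["All", "UNIT_TESTED"]

# Each product's share of the total, as a percentage.
pivot_df["WEIGHTAGE"] = (pivot_df["UNIT_TESTED"] / total * 100).round(4)

# Drop the margin row so only the products remain.
result = pivot_df.drop(index="All")
print(result)
```

If you want to keep the "All" row (its WEIGHTAGE is 100), skip the final drop.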

Answer 2

Suppose your pivot table is pivot_df:

pivot_df['WEIGHTAGE'] = (pivot_df['UNIT_TESTED'] * 100 ) / pivot_df['UNIT_TESTED'].sum()
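One caveat worth checking: if pivot_df was built with margins=True, as in the question, the "All" row is itself part of the column, so .sum() counts the total twice and every percentage comes out halved. The sketch below illustrates this under the assumption that the raw data looks like the question's table (the `df` here is hypothetical), and shows one way to exclude the margin row before summing.

```python
import pandas as pd

# Hypothetical raw data, copied from the table in the question.
df = pd.DataFrame({
    "PRODUCT": ["A", "B", "C", "D"],
    "UNIT_TESTED": [401, 1512, 292, 134],
    "AVG_YIELD": [82.1042, 96.0687, 22.7806, 37.0088],
})

pivot_df = pd.pivot_table(
    df, index="PRODUCT", values=["UNIT_TESTED", "AVG_YIELD"],
    aggfunc="sum", margins=True,
)

# Naive version: the sum includes the "All" row, so the denominator is doubled.
naive = pivot_df["UNIT_TESTED"] * 100 / pivot_df["UNIT_TESTED"].sum()

# Excluding the margin row first gives percentages that add up to 100.
products = pivot_df.drop(index="All")
weightage = products["UNIT_TESTED"] * 100 / products["UNIT_TESTED"].sum()

print(naive)
print(weightage)
```

If the pivot table was created without margins=True, the one-liner above works as written.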
