How to add new columns to pivot table using pandas
I'm trying to create a new column that shows the weightage of every product that I have.
Let's say I have the following dataframe, which I have pivoted:
PRODUCT UNIT_TESTED AVG_YIELD
A 401 82.1042
B 1512 96.0687
C 292 22.7806
D 134 37.0088
using
pd.pivot_table(data = df, index = ['PRODUCT'],
values = ("UNIT_TESTED","AVG_YIELD"),
aggfunc = "sum", margins=True)\
.fillna('')
Now, I want to add a new column WEIGHTAGE for each product.
The calculation:
WEIGHTAGE 'A' = (UNIT_TESTED 'A' / Total of UNIT_TESTED) * 100
This is where I'm stuck: I don't know how to write the code that creates the new column.
My desired output:
PRODUCT UNIT_TESTED AVG_YIELD WEIGHTAGE
A 401 82.1042 17.1441
B 1512 96.0687 64.6430
C 292 22.7806 12.4840
D 134 37.0088 5.7289
The last row of the pivot table that you obtained contains the sum of the units tested. So you can simply divide the column UNIT_TESTED by that value ( pivot_df.loc["All","UNIT_TESTED"] ):
pivot_df = pd.pivot_table(data = df, index = ['PRODUCT'],
values = ("UNIT_TESTED","AVG_YIELD"),
aggfunc = "sum", margins=True)\
.fillna('')
pivot_df["Weightage"] = round((pivot_df["UNIT_TESTED"] / pivot_df.loc["All","UNIT_TESTED"])*100,2)
print(pivot_df)
Output:
AVG_YIELD UNIT_TESTED Weightage
PRODUCT
A 82.1042 401 17.14
B 96.0687 1512 64.64
C 22.7806 292 12.48
D 37.0080 134 5.73
All 237.9615 2339 100.00
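For anyone who wants to run this end to end, here is a minimal sketch. The question does not show the raw df, so the frame below is an assumption: one row per product, filled with the numbers from the question. The name total_units is only illustrative. It rounds to four decimals to match the desired output in the question rather than the two decimals used above.

```python
import pandas as pd

# Assumed raw data (not shown in the question): one row per product.
df = pd.DataFrame({
    "PRODUCT": ["A", "B", "C", "D"],
    "UNIT_TESTED": [401, 1512, 292, 134],
    "AVG_YIELD": [82.1042, 96.0687, 22.7806, 37.0088],
})

pivot_df = pd.pivot_table(
    data=df,
    index=["PRODUCT"],
    values=["UNIT_TESTED", "AVG_YIELD"],
    aggfunc="sum",
    margins=True,  # appends an "All" row holding the column totals
)

# The grand total sits in the "All" row, so read it from there.
total_units = pivot_df.loc["All", "UNIT_TESTED"]

# Share of each product in the total, as a percentage.
pivot_df["WEIGHTAGE"] = (pivot_df["UNIT_TESTED"] / total_units * 100).round(4)

print(pivot_df)
```

The "All" row ends up with a WEIGHTAGE of 100, which is a quick sanity check that the divisor was the grand total.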
Suppose your pivot table is pivot_df and was built without margins=True, so it has no "All" row:
pivot_df['WEIGHTAGE'] = (pivot_df['UNIT_TESTED'] * 100 ) / pivot_df['UNIT_TESTED'].sum()
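One thing to watch with the .sum() approach: if the pivot table was created with margins=True, the "All" row is part of the column, so the sum counts every unit twice and each weightage comes out at half its real value. The sketch below shows both cases. The sample df is an assumption built from the question's numbers, and the names plain, with_margins and product_total are only illustrative.

```python
import pandas as pd

# Assumed raw data (not shown in the question): one row per product.
df = pd.DataFrame({
    "PRODUCT": ["A", "B", "C", "D"],
    "UNIT_TESTED": [401, 1512, 292, 134],
    "AVG_YIELD": [82.1042, 96.0687, 22.7806, 37.0088],
})

# Case 1: no margins, so the column sum is already the grand total.
plain = pd.pivot_table(df, index="PRODUCT",
                       values=["UNIT_TESTED", "AVG_YIELD"], aggfunc="sum")
plain["WEIGHTAGE"] = plain["UNIT_TESTED"] * 100 / plain["UNIT_TESTED"].sum()

# Case 2: margins=True adds an "All" row; leave it out of the divisor.
with_margins = pd.pivot_table(df, index="PRODUCT",
                              values=["UNIT_TESTED", "AVG_YIELD"],
                              aggfunc="sum", margins=True)
product_total = with_margins["UNIT_TESTED"].drop("All").sum()
with_margins["WEIGHTAGE"] = with_margins["UNIT_TESTED"] * 100 / product_total

print(plain)
print(with_margins)
```

In both cases the per-product weightages add up to 100.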