How to calculate the percentage within a group of columns in a pandas DataFrame while keeping the original format of the data
I have a dataset like this:
date product_category product_type amount
2020-01-01 A 1 15
2020-01-01 A 2 25
2020-01-01 A 3 10
2020-01-02 B 1 15
2020-01-02 B 2 10
2020-01-03 C 2 100
2020-01-03 C 1 250
2020-01-03 C 3 150
I am trying to convert this data into normalized amounts within each combination of date and product_category, as shown below:
date product_category product_type amount
2020-01-01 A 1 0.30
2020-01-01 A 2 0.50
2020-01-01 A 3 0.20
2020-01-02 B 1 0.60
2020-01-02 B 2 0.40
2020-01-03 C 2 0.20
2020-01-03 C 1 0.50
2020-01-03 C 3 0.30
Is there a way to do this with a pandas DataFrame and update the original DataFrame?
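Use GroupBy.transform with sum. It repeats the aggregated sum of each group on every row of that group, so the result has the same index as the original DataFrame and the original amount column can be divided by it directly: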
# write the normalized values to a new column
df['norm'] = df['amount'].div(df.groupby(['date','product_category'])['amount'].transform('sum'))
# or overwrite the original column instead
#df['amount'] = df['amount'].div(df.groupby(['date','product_category'])['amount'].transform('sum'))
print(df)
         date product_category  product_type  amount  norm
0  2020-01-01                A             1      15   0.3
1  2020-01-01                A             2      25   0.5
2  2020-01-01                A             3      10   0.2
3  2020-01-02                B             1      15   0.6
4  2020-01-02                B             2      10   0.4
5  2020-01-03                C             2     100   0.2
6  2020-01-03                C             1     250   0.5
7  2020-01-03                C             3     150   0.3
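For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand (the DataFrame construction and the names keys, group_total and check are my own, not from the original post), then applies the same transform('sum') division and checks that the normalized values add up to 1 within each group.

```python
import pandas as pd

# Sample data copied from the question (constructed by hand for this sketch)
df = pd.DataFrame({
    "date": ["2020-01-01"] * 3 + ["2020-01-02"] * 2 + ["2020-01-03"] * 3,
    "product_category": ["A"] * 3 + ["B"] * 2 + ["C"] * 3,
    "product_type": [1, 2, 3, 1, 2, 2, 1, 3],
    "amount": [15, 25, 10, 15, 10, 100, 250, 150],
})

keys = ["date", "product_category"]

# transform('sum') returns a Series aligned to df's index,
# holding each row's group total (50, 25 or 500 here)
group_total = df.groupby(keys)["amount"].transform("sum")

# Dividing row by row keeps the original index and row order
df["norm"] = df["amount"].div(group_total)

# Sanity check: the shares in every (date, product_category) group sum to 1
check = df.groupby(keys)["norm"].sum()

print(df)
print(check)
```

Unlike a plain groupby(...).sum(), which collapses each group to a single row, transform keeps one value per original row, which is why the division works without any merge or reindexing. Note that a group whose amounts sum to 0 would produce NaN or inf in norm, so that case may need separate handling.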