How to calculate the percentage within a group of columns in a pandas DataFrame while keeping the original format of the data
I have a dataset as follows:
date product_category product_type amount
2020-01-01 A 1 15
2020-01-01 A 2 25
2020-01-01 A 3 10
2020-01-02 B 1 15
2020-01-02 B 2 10
2020-01-03 C 2 100
2020-01-03 C 1 250
2020-01-03 C 3 150
I am trying to convert this data into normalized amounts, where each amount is divided by the total for its product_category and date, as shown below:
date product_category product_type amount
2020-01-01 A 1 0.30
2020-01-01 A 2 0.50
2020-01-01 A 3 0.20
2020-01-02 B 1 0.60
2020-01-02 B 2 0.40
2020-01-03 C 2 0.20
2020-01-03 C 1 0.50
2020-01-03 C 3 0.30
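Is there a way to do this in Python and update the original pandas DataFrame?
Use GroupBy.transform with sum. It repeats each group's aggregated sum on every row of that group, so the original amount column can be divided by it: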
# write the result to a new column
df['norm'] = df['amount'].div(df.groupby(['date','product_category'])['amount'].transform('sum'))
# or overwrite the original column instead
#df['amount'] = df['amount'].div(df.groupby(['date','product_category'])['amount'].transform('sum'))
print(df)
date product_category product_type amount norm
0 2020-01-01 A 1 15 0.3
1 2020-01-01 A 2 25 0.5
2 2020-01-01 A 3 10 0.2
3 2020-01-02 B 1 15 0.6
4 2020-01-02 B 2 10 0.4
5 2020-01-03 C 2 100 0.2
6 2020-01-03 C 1 250 0.5
7 2020-01-03 C 3 150 0.3
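For anyone who wants to run the answer end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand and splits the calculation into two steps to show what transform('sum') returns. The intermediate name group_total is my own and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["2020-01-01"] * 3 + ["2020-01-02"] * 2 + ["2020-01-03"] * 3,
    "product_category": ["A", "A", "A", "B", "B", "C", "C", "C"],
    "product_type": [1, 2, 3, 1, 2, 2, 1, 3],
    "amount": [15, 25, 10, 15, 10, 100, 250, 150],
})

# transform('sum') returns a Series with the same index as df,
# where every row holds the total of the group it belongs to.
# (group_total is an illustrative name, not from the original answer.)
group_total = df.groupby(["date", "product_category"])["amount"].transform("sum")

# Dividing row by row gives each row's share of its group total.
df["norm"] = df["amount"] / group_total

print(df)
```

Because the transformed Series is aligned to the original index, the row order and the other columns stay untouched, which is what keeps the data in its original format.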