I have the following dataframe with products A, B and C. Each row is a unique order made by a customer. I want to create a pivot table grouped by SKU and Source so I can see how many units of each product were sold on each Source. The code I am using simply counts the number of rows and ignores the Quantity column. E.g. it tells me that product C has had 1 sale via Ebay, but it should be 2.
id Source SKU Quantity
1 Amazon A 1
2 Amazon B 1
3 Ebay C 2
4 Amazon A 1
The code below is what I am using:
sales = df.groupby(['SKU','Source']).count().reset_index()
sales_by_sku_pivot = sales.pivot(columns='Source',index='SKU',values='Quantity').reset_index()
I know I am missing something that takes into account the values in column Quantity, but I am a bit stumped.
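The mismatch comes from the aggregation: `count()` tallies the non-null rows in each group, while `sum()` adds up the values in them. A minimal sketch of the difference, rebuilding the sample data inline (the names `counts` and `totals` are only illustrative):

```python
import pandas as pd

# Sample orders from the question, built directly as a DataFrame.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "Source": ["Amazon", "Amazon", "Ebay", "Amazon"],
    "SKU": ["A", "B", "C", "A"],
    "Quantity": [1, 1, 2, 1],
})

# count() returns the number of rows per (SKU, Source) group.
counts = df.groupby(["SKU", "Source"])["Quantity"].count()

# sum() returns the total Quantity per (SKU, Source) group.
totals = df.groupby(["SKU", "Source"])["Quantity"].sum()

print(counts.loc[("C", "Ebay")])  # 1: one order row
print(totals.loc[("C", "Ebay")])  # 2: two units in that order
```

For product A the two happen to agree (two orders of one unit each), which is why the problem only shows up for C.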
import io
import pandas as pd

# Rebuild the sample data from whitespace-separated text
df = pd.read_csv(io.StringIO("""id Source SKU Quantity
1 Amazon A 1
2 Amazon B 1
3 Ebay C 2
4 Amazon A 1"""), sep=r"\s+")
df
# output
id Source SKU Quantity
0 1 Amazon A 1
1 2 Amazon B 1
2 3 Ebay C 2
3 4 Amazon A 1
df.pivot_table(index="SKU", columns="Source", values="Quantity", aggfunc="sum").fillna(0)
# output
Source Amazon Ebay
SKU
A 2.0 0.0
B 1.0 0.0
C 0.0 2.0
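The floats in that output come from the NaN placeholders that exist before `fillna(0)` runs. As a possible variation, the missing combinations can be filled during the reshape itself, either with the `fill_value` argument of `pivot_table` or by keeping the original `groupby` approach and swapping `count()` for `sum()` followed by `unstack`. This is a sketch under the same sample data; the names `by_pivot` and `by_groupby` are illustrative only:

```python
import pandas as pd

# Sample orders from the question, built directly as a DataFrame.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "Source": ["Amazon", "Amazon", "Ebay", "Amazon"],
    "SKU": ["A", "B", "C", "A"],
    "Quantity": [1, 1, 2, 1],
})

# pivot_table with fill_value=0 fills SKU/Source pairs that have no orders.
by_pivot = df.pivot_table(
    index="SKU", columns="Source", values="Quantity",
    aggfunc="sum", fill_value=0,
)

# Equivalent route: sum Quantity per (SKU, Source), then move Source
# from the index into the columns, filling gaps with 0.
by_groupby = (
    df.groupby(["SKU", "Source"])["Quantity"]
    .sum()
    .unstack(fill_value=0)
)

print(by_pivot)
print(by_groupby)
```

Both tables hold the same totals as the answer above; which one to use is mostly a matter of taste.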