How to sum number of occurrences in Pandas based on value in cell

Question

I have the following dataframe with products A, B and C. Each row is a unique order made by a customer. I want to create a pivot table grouped by SKU and Source so I can see how many units of each product were sold on each Source. The code I am using simply counts the number of rows and ignores the Quantity column. E.g. it tells me that product C has had 1 sale via Ebay, but it should be 2.

id  Source  SKU  Quantity
1   Amazon  A    1
2   Amazon  B    1
3   Ebay    C    2
4   Amazon  A    1

The code below is what I am using:

sales = df.groupby(['SKU','Source']).count().reset_index()
sales_by_sku_pivot = sales.pivot(columns='Source',index='SKU',values='Quantity').reset_index()

I know I am missing something that takes into account the values in column Quantity, but I am a bit stumped.

Answer 1

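import io
import pandas as pd

# Rebuild the sample data; columns are separated by two or more spaces
df = pd.read_csv(io.StringIO("""id  Source  SKU  Quantity
1   Amazon  A    1
2   Amazon  B    1
3   Ebay    C    2
4   Amazon  A    1"""), sep=r"\s\s+", engine="python")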

df

# output
    id  Source  SKU Quantity
0   1   Amazon  A   1
1   2   Amazon  B   1
2   3   Ebay    C   2
3   4   Amazon  A   1
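
To see why the code in the question reports 1 for product C on Ebay, it may help to compare count() and sum() on the same groups. This is only a minimal sketch: the dataframe is rebuilt by hand from the sample data, and the names grouped, row_counts and units_sold are made up for illustration.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand for illustration
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "Source": ["Amazon", "Amazon", "Ebay", "Amazon"],
    "SKU": ["A", "B", "C", "A"],
    "Quantity": [1, 1, 2, 1],
})

grouped = df.groupby(["SKU", "Source"])["Quantity"]

row_counts = grouped.count()  # number of order rows per (SKU, Source)
units_sold = grouped.sum()    # total Quantity per (SKU, Source)

print(row_counts.loc[("C", "Ebay")])  # one order row
print(units_sold.loc[("C", "Ebay")])  # two units in that order
```

count() only counts the non-null entries in each group, so an order with Quantity 2 still counts as one row. sum() adds up the Quantity values, which is what the question is after.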

# Sum Quantity per SKU and Source; combinations with no sales are filled with 0
df.pivot_table(index="SKU", columns="Source", values="Quantity", aggfunc="sum").fillna(0)

# output
Source  Amazon  Ebay
SKU     
A       2.0     0.0
B       1.0     0.0
C       0.0     2.0
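
The floats in the output above (2.0, 0.0) appear because missing SKU/Source combinations are NaN before fillna(0) runs. As a possible variation, not part of the original answer, the zeros can be filled while reshaping, which should normally keep the totals as integers (the exact dtype may depend on the pandas version). The sketch below rebuilds the sample data by hand; by_pivot and by_groupby are illustrative names.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand for illustration
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "Source": ["Amazon", "Amazon", "Ebay", "Amazon"],
    "SKU": ["A", "B", "C", "A"],
    "Quantity": [1, 1, 2, 1],
})

# Variant 1: let pivot_table fill missing combinations itself
by_pivot = df.pivot_table(
    index="SKU", columns="Source", values="Quantity",
    aggfunc="sum", fill_value=0,
)

# Variant 2: the same reshaping spelled out as groupby + sum + unstack
by_groupby = (
    df.groupby(["SKU", "Source"])["Quantity"]
    .sum()
    .unstack(fill_value=0)
)

print(by_pivot)
print(by_groupby)
```

Both variants give one row per SKU and one column per Source. Add .reset_index() at the end if SKU should be a regular column, as in the question's code.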

1 answers

solution1
0 2021-10-28 17:43:00
