How to sum number of occurrences in Pandas based on value in cell

I have the following dataframe with products A, B and C. Each row is a unique order made by a customer. I want to create a pivot table grouped by SKU and Source so I can see how many units of each product were sold through each Source. The code I am using simply counts the number of rows and ignores the Quantity column. For example, it tells me that product C has had 1 sale via Ebay, but it should be 2.

id  Source  SKU  Quantity
1   Amazon  A    1
2   Amazon  B    1
3   Ebay    C    2
4   Amazon  A    1

The code below is what I am using:

sales = df.groupby(['SKU','Source']).count().reset_index()
sales_by_sku_pivot = sales.pivot(columns='Source',index='SKU',values='Quantity').reset_index()

I know I am missing something that takes into account the values in column Quantity, but I am a bit stumped.
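
The gap between the two numbers comes from count() versus sum(). A minimal sketch, assuming the four sample rows above are loaded into a DataFrame named df (it is built inline here only so the snippet runs on its own):

```python
import pandas as pd

# The four sample orders from the question, built inline for illustration.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "Source": ["Amazon", "Amazon", "Ebay", "Amazon"],
    "SKU": ["A", "B", "C", "A"],
    "Quantity": [1, 1, 2, 1],
})

grouped = df.groupby(["SKU", "Source"])["Quantity"]

# count() tallies the non-null rows in each group; the Quantity values are not added.
rows_per_group = grouped.count()

# sum() adds up the Quantity values in each group.
units_per_group = grouped.sum()

print(rows_per_group.loc[("C", "Ebay")])   # one order row
print(units_per_group.loc[("C", "Ebay")])  # two units sold
```

For product C on Ebay there is one order row but two units, which is why a row count under-reports the sales.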

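import io
import pandas as pd

# Columns are separated by two or more spaces, so use a regex separator
# (raw string avoids an invalid-escape warning) with the python engine.
df = pd.read_csv(io.StringIO("""id  Source  SKU  Quantity
1   Amazon  A    1
2   Amazon  B    1
3   Ebay    C    2
4   Amazon  A    1"""), sep=r"\s\s+", engine="python")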

df

# output
    id  Source  SKU Quantity
0   1   Amazon  A   1
1   2   Amazon  B   1
2   3   Ebay    C   2
3   4   Amazon  A   1

df.pivot_table(index="SKU", columns="Source", values="Quantity", aggfunc="sum").fillna(0)

# output
Source  Amazon  Ebay
SKU     
A       2.0     0.0
B       1.0     0.0
C       0.0     2.0
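
If whole-number output is preferred, one possible alternative is to group, sum, and reshape with unstack(fill_value=0). This fills missing SKU/Source pairs during the reshape rather than afterwards, so the totals should stay integers instead of becoming floats. This is a sketch on the same sample data, not part of the original answer, and the variable name units is only illustrative:

```python
import pandas as pd

# Same four sample orders, built inline so the snippet is self-contained.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "Source": ["Amazon", "Amazon", "Ebay", "Amazon"],
    "SKU": ["A", "B", "C", "A"],
    "Quantity": [1, 1, 2, 1],
})

units = (
    df.groupby(["SKU", "Source"])["Quantity"]
    .sum()                   # total units per (SKU, Source) pair
    .unstack(fill_value=0)   # Source becomes columns; absent pairs get 0
)

print(units)
```

Calling units.reset_index() afterwards turns SKU back into a regular column, as in the question's original code.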
