Count the total of a groupby with pandas

Question

import pandas as pd

details = {
    'order_number': ['#1', '#2', '#3', '#4', '#4'],
    'disc_code': ['no_discount', 'superman', 'hero', 'numero_uno', 'numero_uno'],
}
df = pd.DataFrame(details)

In my real data, len(df) --> 6408.
Each row corresponds to one product rather than one transaction. If I group the rows by order number, there are 3560 groups: len(df.groupby('order_number')) --> 3560

I want to count how many discount codes are used in total. (If no discount code is used, the value is 'no_discount'.)

In SQL, the syntax probably looks like this:

SELECT COUNT(*)
FROM transactions
WHERE disc_code != 'no_discount'
GROUP BY order_number

Answer 1

Use boolean indexing with GroupBy.size if you need the count per order_number:

df1 = (df[df['disc_code'].ne('no_discount')]
           .groupby('order_number')
           .size()
           .reset_index(name='count'))
print(df1)
  order_number  count
0           #2      1
1           #3      1
2           #4      2

If you need the total count across all rows, compare with Series.ne and sum the resulting boolean values, since each True counts as 1:

out = df['disc_code'].ne('no_discount').sum()
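
Because each row is a product rather than a transaction, the row total above counts order #4 twice. If the goal is instead the number of orders that used a discount code, one possible sketch is below. It assumes an order counts once when at least one of its rows carries a code, and the names used, rows_with_code and orders_with_code are illustrative only:

```python
import pandas as pd

# Sample data from the question: one row per product, so order '#4' appears twice.
df = pd.DataFrame({
    'order_number': ['#1', '#2', '#3', '#4', '#4'],
    'disc_code': ['no_discount', 'superman', 'hero', 'numero_uno', 'numero_uno'],
})

# Boolean mask: True where a real discount code was used.
used = df['disc_code'].ne('no_discount')

# Number of rows (products) that carry a discount code.
rows_with_code = int(used.sum())

# Number of distinct orders with at least one discounted row
# (assumption: an order counts once, however many rows it has).
orders_with_code = df.loc[used, 'order_number'].nunique()

print(rows_with_code)    # 4
print(orders_with_code)  # 3
```

Which of the two numbers is the right one depends on whether "used" is meant per product row or per transaction.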
