
pandas groupby count based on conditions

I am trying to add columns to a dataframe that give, for each customer, a count of the returns on their account by payment method.

This is what the dataframe looks like:

CustomerID       Return$     Payment Method
000010           10          Credit Card
000010           15          Credit Card
000011           10          Check
000011           15          Credit Card
000011           10          Credit Card

This is the expected outcome:

CustomerID     Return$   Payment Method   CC Return Count  Check Return Count
000010           10        Credit Card         2                  0
000010           15        Credit Card         2                  0
000011           10        Check               2                  1
000011           15        Credit Card         2                  1
000011           10        Credit Card         2                  1

This is the code that I have tried, but it only gives me a column with boolean values:

return_df['CC Boolean'] = return_df.groupby(['CustomerID'])['Payment Method'].apply(lambda x: x == 'Credit Card')

This other piece of code gives the total count of returns per customer, regardless of payment method:

return_df['Counter'] = return_df.groupby('CustomerID')['Payment Method'].transform('count')

method_dict = df.groupby('CustomerID')['Payment Method'].value_counts().unstack().fillna(0).to_dict()

df['CC Return Count'] = df['CustomerID'].map(method_dict['Credit Card'])
df['Check Return Count'] = df['CustomerID'].map(method_dict['Check'])

The resulting method_dict looks like:

{'Check': {10: 0.0, 11: 1.0}, 'Credit Card': {10: 2.0, 11: 2.0}}

Output:

   CustomerID  Return$ Payment Method  CC Return Count  Check Return Count
0          10       10    Credit Card              2.0                 0.0
1          10       15    Credit Card              2.0                 0.0
2          11       10          Check              2.0                 1.0
3          11       15    Credit Card              2.0                 1.0
4          11       10    Credit Card              2.0                 1.0
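
The counts above come out as floats because fillna(0) runs after unstack has introduced NaN. As a minimal sketch of one way to keep integer counts, the same mapping idea can be built on pd.crosstab instead of value_counts/unstack. This is a swapped-in technique, not the code from the answer, and the sample frame below assumes CustomerID is stored as an integer, as in the output shown.

```python
import pandas as pd

# Sample data reconstructed from the question (CustomerID assumed to be int).
df = pd.DataFrame({
    "CustomerID": [10, 10, 11, 11, 11],
    "Return$": [10, 15, 10, 15, 10],
    "Payment Method": ["Credit Card", "Credit Card", "Check",
                       "Credit Card", "Credit Card"],
})

# One row per customer, one column per payment method, integer counts,
# with 0 where a customer never used a method.
counts = pd.crosstab(df["CustomerID"], df["Payment Method"])

# Map each row's CustomerID to that customer's per-method count.
df["CC Return Count"] = df["CustomerID"].map(counts["Credit Card"])
df["Check Return Count"] = df["CustomerID"].map(counts["Check"])

print(df)
```

If a payment method never appears in the data at all, counts will have no column for it, so counts["Check"] would raise a KeyError; that case would need separate handling.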

You can group by CustomerID and count the number of rows with 'Check' and 'Credit Card' for each customer. Using transform broadcasts each group's count back to every row, so the original structure of the dataframe is preserved:

df['check'] = df.groupby('CustomerID')['Payment Method'].transform(lambda x: sum(x == 'Check'))
df['credit'] = df.groupby('CustomerID')['Payment Method'].transform(lambda x: sum(x == 'Credit Card'))

Output:

   CustomerID  Return$ Payment Method  check  credit
0          10       10    Credit Card      0       2
1          10       15    Credit Card      0       2
2          11       10          Check      1       2
3          11       15    Credit Card      1       2
4          11       10    Credit Card      1       2
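
For reference, here is a self-contained sketch of the same transform idea that avoids the Python-level lambda: the comparison is done once on the whole column, and the resulting 0/1 flags are summed per customer. The sample frame is reconstructed from the question with CustomerID assumed to be an integer, and the column names check and credit follow the answer above.

```python
import pandas as pd

# Sample data reconstructed from the question (CustomerID assumed to be int).
df = pd.DataFrame({
    "CustomerID": [10, 10, 11, 11, 11],
    "Return$": [10, 15, 10, 15, 10],
    "Payment Method": ["Credit Card", "Credit Card", "Check",
                       "Credit Card", "Credit Card"],
})

# Flag matching rows as 0/1, then sum the flags within each customer.
# transform("sum") returns a result aligned with the original rows.
for column, method in [("check", "Check"), ("credit", "Credit Card")]:
    flags = df["Payment Method"].eq(method).astype(int)
    df[column] = flags.groupby(df["CustomerID"]).transform("sum")

print(df)
```

On large frames the built-in "sum" aggregation is usually faster than calling a lambda once per group, though the result is the same.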

