Counting distinct values in Python Pandas

Question

I am working with a pivot table and trying to write code that shows the number of consumer accounts each customer has. So far I have the following:

import pandas as pd
df1=pd.DataFrame({'custID':[1,1,2,2,2,3,3,4,4],
              'accountID':[1,2,1,2,3,1,2,1,2],
              'tenure_mo':[2,3,4,4,5,6,6,6,7],
             'account_type':['BusiNESS','CONSUMER',
                            'consumer',
                            'BUSINESS',
                            'BuSIness',
                            'CONSUmer',
                            'consumer',
                            'CONSUMER',
                            'BUSINESS']},columns=['custID','accountID','tenure_mo','account_type'])
print(df1)
df2=pd.DataFrame({'custID':[1,2,3,4],
             'cust_age':[20,35,50,85]},columns=['custID','cust_age'])

This is the output I want:

custID num_cons_accounts
     1                 1
     2                 1
     3                 2
     4                 1

How can I modify or extend my code to produce this output?

Answer 1

Based on your description, the following code should work:

import pandas as pd

df1=pd.DataFrame({'custID':[1,1,2,2,2,3,3,4,4],
              'accountID':[1,2,1,2,3,1,2,1,2],
              'tenure_mo':[2,3,4,4,5,6,6,6,7],
             'account_type':['BusiNESS','CONSUMER',
                            'consumer',
                            'BUSINESS',
                            'BuSIness',
                            'CONSUmer',
                            'consumer',
                            'CONSUMER',
                            'BUSINESS']},columns=['custID','accountID','tenure_mo','account_type'])

df1 = df1[df1['account_type'].str.lower() == "consumer"]

print(df1.groupby("custID").count())

This selects the rows whose lowercased account type equals "consumer", then counts the rows for each custID.

Output:

        accountID  tenure_mo  account_type
custID                                    
1               1          1             1
2               1          1             1
3               2          2             2
4               1          1             1

Note: if you only want a single column, drop the rest :)
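As a follow-up to the note above, here is a minimal sketch of one way to get a single count column named as in the question. The variable names `is_consumer` and `num_cons` are only illustrative, and the unused `tenure_mo` column is left out for brevity.

```python
import pandas as pd

df1 = pd.DataFrame({
    'custID': [1, 1, 2, 2, 2, 3, 3, 4, 4],
    'accountID': [1, 2, 1, 2, 3, 1, 2, 1, 2],
    'account_type': ['BusiNESS', 'CONSUMER', 'consumer', 'BUSINESS', 'BuSIness',
                     'CONSUmer', 'consumer', 'CONSUMER', 'BUSINESS'],
})

# Boolean mask: True for consumer accounts, ignoring case.
is_consumer = df1['account_type'].str.lower() == 'consumer'

# Count the remaining rows per customer; reset_index(name=...) turns the
# resulting Series into a DataFrame and names the count column.
num_cons = (
    df1[is_consumer]
    .groupby('custID')
    .size()
    .reset_index(name='num_cons_accounts')
)
print(num_cons)
```

Because the filter runs before the grouping, a customer with no consumer accounts would not appear in the result at all. With the sample data every customer has at least one, so all four custID values are present.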

Answer 2

Use apply with a lambda function and set to find the number of distinct account types for each custID, based on a lowercased account_type2 column:

 import pandas as pd

 df1=pd.DataFrame({'custID':[1,1,2,2,2,3,3,4,4],
          'accountID':[1,2,1,2,3,1,2,1,2],
          'tenure_mo':[2,3,4,4,5,6,6,6,7],
         'account_type':['BusiNESS','CONSUMER','consumer','BUSINESS','BuSIness','CONSUmer',
                        'consumer', 'CONSUMER','BUSINESS']},columns=['custID','accountID','tenure_mo','account_type'])

 df1['account_type2']=df1['account_type'].apply(lambda row: row.lower())
 
 grouped=df1.groupby('custID').apply(lambda row: len(set(row.account_type2)))
 print(grouped)

Output:

 custID distinct count
 1    2
 2    2
 3    1
 4    2
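The same distinct count can probably be written without apply or set, using the built-in nunique aggregation. This is a sketch under that assumption, not the answerer's own code, and `distinct_types` is an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({
    'custID': [1, 1, 2, 2, 2, 3, 3, 4, 4],
    'account_type': ['BusiNESS', 'CONSUMER', 'consumer', 'BUSINESS', 'BuSIness',
                     'CONSUmer', 'consumer', 'CONSUMER', 'BUSINESS'],
})

# Lowercase the labels so 'CONSUMER' and 'consumer' count as the same type,
# then count the distinct labels within each custID group.
distinct_types = (
    df1['account_type'].str.lower()
    .groupby(df1['custID'])
    .nunique()
)
print(distinct_types)
```

Note that this counts how many different account types each customer holds, which is not the same as the number of consumer accounts asked for in the question. For example, custID 3 has two consumer accounts but only one distinct type.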
