I am using my bank transactions data to bucket them into different categories.
This is the data frame df I have:
Date Transaction Debit Credit Balance
13/12/21 ECOM PUR/ZOMATO/xxx 200 - 30000
13/12/21 UPI/P2M/xxx/SWIGGY/ 500 - 29500
14/12/21 NEFT/xxxxx/ - 30000 59500
I need to search for string matches in the transaction column and bucket them into categories in a new column.
Eg SWIGGY and ZOMATO gets bucketed as food and NEFT is bucketed as salary
Desired output:
Date Transaction Debit Credit Balance Categories
13/12/21 ECOM PUR/ZOMATO/xxx 200 - 30000 Food
13/12/21 UPI/P2M/xxx/SWIGGY/ 500 - 29500 Food
14/12/21 NEFT/xxxxx/ - 30000 59500 Salary
Eg. in R, for string matching I would use something like
df[grepl('SWIGGY|ZOMATO', transactions)]$Categories <- 'Food'
How do I go about this in python? Really new to python, would appreciate the help. TIA:)
Create a dict mapping of your categories then split your Transaction
column and find match:
# Your categories here
categories = {'Food': ['SWIGGY', 'ZOMATO'],
'Salary': ['NEFT']}
# create a reverse dict of categories
mappings = {v: k for k, l in categories.items() for v in l}
df['Categories'] = df['Transaction'].str.split('/').explode().map(mappings).dropna()
Output:
>>> df
Date Transaction Debit Credit Balance Categories
0 13/12/21 ECOM PUR/ZOMATO/xxx 200 - 30000 Food
1 13/12/21 UPI/P2M/xxx/SWIGGY/ 500 - 29500 Food
2 14/12/21 NEFT/xxxxx/ - 30000 59500 Salary
>>> categories
{'Food': ['SWIGGY', 'ZOMATO'], 'Salary': ['NEFT']}
>>> mappings
{'SWIGGY': 'Food',
'ZOMATO': 'Food',
'NEFT': 'Salary'}
Update : In the case of you have multiple match categories on one row and/or no categories, you can use:
categories = {'Food': ['SWIGGY', 'ZOMATO'],
'Cream': ['ECOM PUR']}
mappings = {v: k for k, l in categories.items() for v in l}
df['Categories'] = \
df['Transaction'].str.split('/').explode().map(mappings).dropna() \
.groupby(level=0).apply(lambda x: ','.join(set(x))) \
.reindex(df.index).fillna('Other')
Output:
>>> df
Date Transaction Debit Credit Balance Categories
0 13/12/21 ECOM PUR/ZOMATO/xxx 200 - 30000 Cream,Food
1 13/12/21 UPI/P2M/xxx/SWIGGY/ 500 - 29500 Food
2 14/12/21 NEFT/xxxxx/ - 30000 59500 Other
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.