I have DataFrame named 'concated'. It has columns: 'amount' - with a sums of transactions, 'mcccode_trtype' with a merchant type. I need to count only negative amounts of transactions by merchant code and count the mean of these transactions. And I need to filter merchants with more than 10 transactions.
So, I wrote this code:
res=concated[concated.amount<0].groupby('mcccode_trtype')['amount'].agg(['count', 'mean'])
The problem: How can I filter this by 'count' column in one line? I guess I should use 'filter' + lambda function, but finally failed with syntax.
Please, help.
Sample data:
from numpy import nan
concated=pd.DataFrame(
{0: {'amount': -2245.92,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '0 10:23:26',
'tr_description': 'DESCR2',
'tr_type': 1030},
1: {'amount': -5614.79,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '6 07:08:31',
'tr_description': 'DESCR2',
'tr_type': 1030},
2: {'amount': -1122.96,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '8 07:06:10',
'tr_description': 'DESCR2',
'tr_type': 1030},
3: {'amount': -2245.92,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '11 08:49:03',
'tr_description': 'DESCR2',
'tr_type': 1030},
4: {'amount': -2245.92,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '11 14:12:08',
'tr_description': 'DESCR2',
'tr_type': 1030},
5: {'amount': -2245.92,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '11 14:15:30',
'tr_description': 'DESCR2',
'tr_type': 1030},
6: {'amount': -2245.92,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '13 11:17:34',
'tr_description': 'DESCR2',
'tr_type': 1030},
7: {'amount': -2245.92,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '18 07:39:05',
'tr_description': 'DESCR2',
'tr_type': 1030},
8: {'amount': -449.18,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '19 13:24:35',
'tr_description': 'DESCR2',
'tr_type': 1030},
9: {'amount': -1122.96,
'customer_id': 39026145,
'gender': 1.0,
'mcc_code': 4814,
'mcc_description': 'DESCR1',
'mcccode_trtype': '48141030',
'term_id': nan,
'tr_datetime': '19 13:25:31',
'tr_description': 'DESCR2',
'tr_type': 1030}}).transpose()
concated
You can use .loc
in combination with a lambda function as follows:
res = concated[concated.amount<0].groupby('mcccode_trtype')['amount'].agg(['count', 'mean']).loc[lambda x: x["count"] > 10]
Use DataFrame.query
for filter by column count
:
#if necessary
concated.amount = concated.amount.astype(float)
res=concated[concated.amount<0].groupby('mcccode_trtype')['amount'].agg(['count', 'mean']).query('count == 10')
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.