
Group by multi-level category and return sum of n-largest in each category (n is different for each category)

I have a pandas dataframe (df) representing monthly expenses by different individuals. The first column in the dataframe refers to the person ID, the second column refers to the expense category, and the third column refers to the amount being spent. See the example table below:

import pandas as pd
d = {'PersonID': ['A','A','A','A','A','A','A','A','B','B','B','B','B','B'], 'Category': ['Food','Food','Food','Food','Travel','Travel','Travel','Travel','Food','Food','Food','Travel','Travel','Travel'], 'Expenditure':[10,15,5,20,500,100,1000,2000,10,30,10,800,1000,400]}
df = pd.DataFrame(data=d)


For each person, I'd like to get the sum of the THREE largest expenses in the Food category, and the sum of the TWO largest expenses in the Travel category.

For the example table above, I want the following table:

PersonID  Category  Expenditure
A         Food               45
A         Travel           3000
B         Food               50
B         Travel           1800

I am trying to use the following code, but the problem is that I cannot specify a different N for each category.

df.groupby(['PersonID','Category'])['Expenditure'].nlargest(2).sum(level=0)

One way to do it is to split your dataframe by category first, then sum the n largest values per group and concatenate the results afterwards:

# Series.sum(level=...) was removed in pandas 2.0; groupby(level=...).sum() is the equivalent.
pd.concat([
    df.query('Category == "Food"').groupby(['PersonID','Category'])['Expenditure'].nlargest(3).groupby(level=[0,1]).sum(),
    df.query('Category == "Travel"').groupby(['PersonID','Category'])['Expenditure'].nlargest(2).groupby(level=[0,1]).sum()
])

Output:

PersonID  Category
A         Food          45
B         Food          50
A         Travel      3000
B         Travel      1800
Name: Expenditure, dtype: int64
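
The same result can probably be reached without splitting the dataframe per category. The sketch below is not from the answer above; it assumes a hypothetical mapping named top_n that gives the number of expenses to keep for each category. It ranks the rows inside each (PersonID, Category) group and keeps only those whose rank is below the limit for their category. Categories missing from top_n would be dropped, since the comparison against a missing limit is False.

```python
import pandas as pd

df = pd.DataFrame({
    'PersonID': ['A','A','A','A','A','A','A','A','B','B','B','B','B','B'],
    'Category': ['Food','Food','Food','Food','Travel','Travel','Travel','Travel',
                 'Food','Food','Food','Travel','Travel','Travel'],
    'Expenditure': [10,15,5,20,500,100,1000,2000,10,30,10,800,1000,400],
})

# Hypothetical mapping (an assumption for this sketch):
# how many of the largest expenses to sum in each category.
top_n = {'Food': 3, 'Travel': 2}

# Sort so the largest expenses come first.
ranked = df.sort_values('Expenditure', ascending=False)

# 0-based position of each row inside its (PersonID, Category) group.
position = ranked.groupby(['PersonID', 'Category']).cumcount()

# Keep a row only if its position is below the limit for its category.
keep = position < ranked['Category'].map(top_n)

result = ranked[keep].groupby(['PersonID', 'Category'])['Expenditure'].sum()
print(result)
```

Because the final groupby sorts its keys, the rows come out ordered by PersonID first and then by Category, rather than category by category as in the concatenated output.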

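Using a dictionary and a list comprehension:

# Number of largest expenses to sum per category, as asked in the question.
d = {'Food':3,
     'Travel':2}

# Series.sum(level=...) was removed in pandas 2.0; groupby(level=...).sum() is the equivalent.
pd.concat([df[df['Category'] == c].groupby(['PersonID','Category'])['Expenditure'].nlargest(n).groupby(level=[0,1]).sum() for c,n in d.items()])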
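
The dictionary lookup can also happen inside a single groupby, which avoids filtering the dataframe once per category. This is a minimal sketch rather than part of the answer above: the names top_n and sum_top are made up for illustration, and it relies on pandas setting the name attribute of each group passed to apply to that group's key.

```python
import pandas as pd

df = pd.DataFrame({
    'PersonID': ['A','A','A','A','A','A','A','A','B','B','B','B','B','B'],
    'Category': ['Food','Food','Food','Food','Travel','Travel','Travel','Travel',
                 'Food','Food','Food','Travel','Travel','Travel'],
    'Expenditure': [10,15,5,20,500,100,1000,2000,10,30,10,800,1000,400],
})

# Hypothetical mapping: number of largest expenses to sum per category.
top_n = {'Food': 3, 'Travel': 2}

def sum_top(expenses):
    # Inside apply, expenses.name holds the group key (PersonID, Category).
    category = expenses.name[1]
    return expenses.nlargest(top_n[category]).sum()

totals = df.groupby(['PersonID', 'Category'])['Expenditure'].apply(sum_top)
print(totals)
```

A category that is missing from top_n raises a KeyError here, which may be preferable to silently skipping it.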
