
Group pandas dataframe and calculate mean for multiple columns

I'm trying to group a pandas dataframe by a column and then also calculate the mean for multiple columns. In the sample below I would like to group by the 'category' column and then calculate the mean for the 'score' and 'priority' columns. All three columns should be in the resulting dataframe.

I am able to group and calculate the mean for the first column but I don't know how to add the second column. Below my attempt.

Any guidance greatly appreciated.

import pandas as pd

data = [['A', 2, 1], ['A', 4, 2], ['B', 5, 3], ['B', 2, 3]]
df = pd.DataFrame(data, columns=['category', 'score', 'priority'])
print(df)

#  This fails:
results_df = df.groupby('category')['score'].agg(['mean',])['priority'].agg(['mean',])
print(results_df)
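
You can group by 'category' and call mean() directly. Passing as_index=False keeps 'category' as a regular column in the result:

df.groupby("category", as_index=False).mean()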

The first part of your code (up to print(df)) correctly prints the dataframe:

  category  score  priority
0        A      2         1
1        A      4         2
2        B      5         3
3        B      2         3

Now add this line:

df.groupby("category").mean(numeric_only=True)

and you will see:

          score  priority
category                 
A           3.0       1.5
B           3.5       3.0

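which is probably what you're looking for. Calling mean(numeric_only=True) on the grouped DataFrame averages every numeric column within each group. (With this sample data you can leave numeric_only out, because 'score' and 'priority' are both numeric. If the dataframe also had non-numeric columns, leaving it out gives a deprecation warning in pandas 1.5 and raises a TypeError from pandas 2.0 onward.)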
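
As for why the original attempt fails: the first .agg(['mean',]) already returns a DataFrame whose only column is 'mean', so indexing it with ['priority'] raises a KeyError. If you would rather name the columns to average explicitly and keep 'category' as a regular column, something along these lines should work. This is a minimal sketch, not the only way to do it, and the output names mean_score and mean_priority are made up for illustration:

```python
import pandas as pd

data = [['A', 2, 1], ['A', 4, 2], ['B', 5, 3], ['B', 2, 3]]
df = pd.DataFrame(data, columns=['category', 'score', 'priority'])

# Select both columns with a list, then take the mean per group.
# as_index=False keeps 'category' as a column instead of the index.
results_df = df.groupby('category', as_index=False)[['score', 'priority']].mean()
print(results_df)

# Named aggregation gives the same numbers under custom column names
# (mean_score and mean_priority are hypothetical names).
named_df = df.groupby('category', as_index=False).agg(
    mean_score=('score', 'mean'),
    mean_priority=('priority', 'mean'),
)
print(named_df)
```

Both results contain all three columns, with means of 3.0 and 1.5 for category A and 3.5 and 3.0 for category B. Calling .reset_index() on the index-based result from df.groupby("category").mean(numeric_only=True) is another way to turn 'category' back into a column.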
