Pandas dataframe: Group by two columns and then average over another column

Assume I have a dataframe with the following values:

df:
col1    col2    value
1       2       3
1       2       1
2       3       1

I want to first group my dataframe by the first two columns (col1 and col2) and then average the values of the third column (value) within each group. So the desired output would look like this:

col1    col2    avg-value
1       2       2
2       3       1

I am using the following code:

columns = ['col1','col2','avg']
df = pd.DataFrame(columns=columns)
df.loc[0] = [1,2,3]
df.loc[1] = [1,3,3]
print(df[['col1','col2','avg']].groupby('col1','col2').mean())

This raises the following error:

ValueError: No axis named col2 for object type <class 'pandas.core.frame.DataFrame'>

Any help would be much appreciated.

You need to pass a list of the columns to groupby. In groupby('col1','col2'), the second positional argument, 'col2', was interpreted as the axis parameter, which is why it raised an error:

In [30]:
columns = ['col1','col2','avg']
df = pd.DataFrame(columns=columns)
df.loc[0] = [1,2,3]
df.loc[1] = [1,3,3]

print(df[['col1','col2','avg']].groupby(['col1','col2']).mean())
           avg
col1 col2     
1    2       3
     3       3
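
For reference, here is a minimal sketch that starts from the example table in the question (rather than the two rows used in the snippet above) and produces the flat layout the question asks for. It assumes you want col1 and col2 kept as ordinary columns, which is what as_index=False does; the variable name result is just for illustration.

```python
import pandas as pd

# The three rows from the question's example table.
df = pd.DataFrame({
    "col1": [1, 1, 2],
    "col2": [2, 2, 3],
    "value": [3, 1, 1],
})

# Group on both key columns (passed as a list), average 'value',
# and keep the keys as regular columns instead of an index.
result = (
    df.groupby(["col1", "col2"], as_index=False)["value"]
      .mean()
      .rename(columns={"value": "avg-value"})
)
print(result)
```

Note that the averages come back as floats (2.0 and 1.0), not integers.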

If you want to group by multiple columns, you should put them in a list:

import pandas as pd
columns = ['col1','col2','value']
df = pd.DataFrame(columns=columns)
df.loc[0] = [1,2,3]
df.loc[1] = [1,3,3]
df.loc[2] = [2,3,1]
# Pass both key columns as a single list, then average the remaining column.
print(df.groupby(['col1','col2']).mean())
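
One thing worth knowing about the result: the grouping columns end up in the index (a MultiIndex), not in regular columns. The sketch below uses the same three rows, built in one call for brevity, and shows how reset_index() turns the keys back into columns. The names means and flat are only illustrative.

```python
import pandas as pd

# Same three rows as above, constructed in a single call.
df = pd.DataFrame(
    [[1, 2, 3], [1, 3, 3], [2, 3, 1]],
    columns=["col1", "col2", "value"],
)

means = df.groupby(["col1", "col2"]).mean()
# The group keys form the index levels of the result.
print(list(means.index.names))

# reset_index() moves the keys back into ordinary columns.
flat = means.reset_index()
print(flat)
```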

Or, slightly more verbosely, to get the name 'avg' into your aggregated dataframe:

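import pandas as pd
columns = ['col1','col2','value']
df = pd.DataFrame(columns=columns)
df.loc[0] = [1,2,3]
df.loc[1] = [1,3,3]
df.loc[2] = [2,3,1]
# Named aggregation (pandas 0.25+): output column 'avg' is the mean of 'value'.
# The nested-dict form {'value': {'avg': np.mean}} raises SpecificationError in pandas 1.0+.
print(df.groupby(['col1','col2']).agg(avg=('value', 'mean')))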
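
If you need more than one labelled statistic per group, keyword-style named aggregation (assuming pandas 0.25 or later) lets you name each output column. This is a sketch on the question's example table; the output names avg, low, high and n are made up for illustration.

```python
import pandas as pd

# The example table from the question.
df = pd.DataFrame({
    "col1": [1, 1, 2],
    "col2": [2, 2, 3],
    "value": [3, 1, 1],
})

# Each keyword is an output column name; each tuple is
# (input column, aggregation function name).
stats = df.groupby(["col1", "col2"]).agg(
    avg=("value", "mean"),
    low=("value", "min"),
    high=("value", "max"),
    n=("value", "count"),
)
print(stats)
```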

