Python Pandas: How to put the result of a DataFrame groupby into a data frame with a new column name?

Question

I have a data frame with two columns, 'id' and 'time'. I need to compute the mean time for each id and put the result into a new data frame with a new column name. Input data frame:

    id  time
0    1     1
1    1     1
2    1     1
3    1     1
4    1     2
5    1     2
6    2     1
7    2     1
8    2     2
9    2     2
10   2     2
11   2     2

My code:

import pandas as pd

my_dict = {
    'id':  [1,1,1, 1,1,1, 2,2,2, 2,2,2],
    'time':[1,1,1, 1,2,2, 1,1,2, 2,2,2]
    }

df = pd.DataFrame(my_dict)
x = df.groupby(['id'])['time'].mean()

# x is a pandas.core.series.Series                                                                  
type(x)

y = x.to_frame()
# y is pandas.core.frame.DataFrame                                                                  
type(y)
list(y)

Running this code results in:

In [14]: y
Out[14]:
        time
id
1   1.333333
2   1.666667

The groupby aggregation returns a pandas Series 'x', which I then convert to a data frame 'y'. How can I change the column name in the output data frame 'y' from 'time' to something else, for example 'mean'? Ideally I need an output data frame with two columns: 'id' and 'mean'. How can I do this?

Update 2:

y = x.to_frame('mean').reset_index()

Solves the problem!

Answer 1

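You can use agg to name the result. With named aggregation, the keyword is the name of the output column and the value is a (column, aggregate function) pair. as_index=False keeps id as a regular column rather than the index. Older pandas versions also accepted a dict here, ['time'].agg({'mean': 'mean'}), but that renaming form was deprecated in pandas 0.20 and raises SpecificationError since pandas 1.0.

df.groupby(['id'], as_index=False).agg(mean=('time', 'mean'))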
Out: 
   id      mean
0   1  1.333333
1   2  1.666667
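
If renaming inside agg is not convenient, another option is to aggregate first and rename afterwards. The following is a minimal sketch of that approach, assuming the same df as in the question; the variable name result is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'id':   [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    'time': [1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 2, 2],
})

# Aggregate first: with as_index=False the result is a DataFrame
# with columns 'id' and 'time'. Then rename 'time' to 'mean'.
result = (
    df.groupby('id', as_index=False)['time']
      .mean()
      .rename(columns={'time': 'mean'})
)
print(result)
```

This keeps the aggregation and the renaming as two separate, explicit steps, which can be easier to read when several columns need new names.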

Using your Series x, this would also have worked:

x.to_frame('mean').reset_index()
Out: 
   id      mean
0   1  1.333333
1   2  1.666667
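
The same conversion can also be written as a single call, since Series.reset_index accepts a name argument for the value column. A short sketch, again assuming the x built in the question:

```python
import pandas as pd

df = pd.DataFrame({
    'id':   [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    'time': [1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 2, 2],
})

# x is a Series indexed by 'id', as in the question.
x = df.groupby('id')['time'].mean()

# reset_index moves 'id' from the index into a column;
# name='mean' sets the name of the column holding the values.
y = x.reset_index(name='mean')
print(y)
```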

Answer 1: accepted, 2016-12-18 11:04:50
