Python Pandas: How to put result of DataFrame groupby in a data frame with new column name?
I have a data frame with two columns, 'id' and 'time'. I need to compute the mean time for each id and put the result into a new data frame with a new column name.
Input data frame:
id time
0 1 1
1 1 1
2 1 1
3 1 1
4 1 2
5 1 2
6 2 1
7 2 1
8 2 2
9 2 2
10 2 2
11 2 2
My code:
import pandas as pd
my_dict = {
    'id':   [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    'time': [1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 2, 2]
}
df = pd.DataFrame(my_dict)
x = df.groupby(['id'])['time'].mean()
# x is a pandas.core.series.Series
type(x)
y = x.to_frame()
# y is pandas.core.frame.DataFrame
type(y)
list(y)
Running this code results in:
In [14]: y
Out[14]:
time
id
1 1.333333
2 1.666667
groupby returns a pandas Series 'x', which I then convert to a data frame 'y'. How can I change the column name in the output data frame 'y' from 'time' to something else, for example 'mean'?
Ideally I need an output data frame with two columns: 'id' and 'mean'. How can I do this?
Update2:
y = x.to_frame('mean').reset_index()
Solves the problem!
You can use agg to pass a name. The key is the name of the output column and the value is the alias for the aggregate function. as_index=False keeps the id column as a regular column instead of turning it into the index:
df.groupby(['id'], as_index=False)['time'].agg({'mean': 'mean'})
Out:
id mean
0 1 1.333333
1 2 1.666667
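A note for newer pandas: as far as I know, passing a dict to agg on a single selected column in order to rename the result was deprecated in 0.20 and raises SpecificationError from 1.0 onward. A minimal sketch of the same idea using named aggregation (available since 0.25), where the keyword is the output column name and the tuple is (input column, aggregate function); the variable name result is just for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'id':   [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    'time': [1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 2, 2],
})

# Named aggregation: keyword = output column name,
# value = (input column, aggregate function).
# as_index=False keeps 'id' as a regular column.
result = df.groupby('id', as_index=False).agg(mean=('time', 'mean'))
print(result)
```

This should give the same two-column frame ('id' and 'mean') as the output shown above.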
Using your Series x, this would also have worked:
x.to_frame('mean').reset_index()
Out:
id mean
0 1 1.333333
1 2 1.666667
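Two other spellings of the same Series-to-frame step, in case they read better to you. This is only a sketch; y1 and y2 are illustrative names, and both rely on standard Series methods (reset_index with its name argument, and rename with a scalar):

```python
import pandas as pd

df = pd.DataFrame({
    'id':   [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    'time': [1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 2, 2],
})
x = df.groupby('id')['time'].mean()

# Move the 'id' index back into a column and name the value column in one call.
y1 = x.reset_index(name='mean')

# Equivalent: rename the Series first, then reset the index.
y2 = x.rename('mean').reset_index()

print(y1)
```

Both produce a data frame with the columns 'id' and 'mean', matching x.to_frame('mean').reset_index().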