Adding Column Headers to new pandas dataframe
I am creating a new pandas dataframe from a previous dataframe using the .groupby and .size methods.
[in] results = df.groupby(["X", "Y", "Z", "F"]).size()
[out]
9 27/02/2016 1 N 326
9 27/02/2016 1 S 332
9 27/02/2016 2 N 280
9 27/02/2016 2 S 353
9 27/02/2016 3 N 177
This behaves as expected; however, the result is a dataframe with no column headers.
This SO question states that the following adds column names to the generated dataframe:
[in] results.columns = ["X","Y","Z","F","Count"]
However, this does not seem to have any impact at all.
[out]
9 27/02/2016 1 N 326
9 27/02/2016 1 S 332
9 27/02/2016 2 N 280
9 27/02/2016 2 S 353
9 27/02/2016 3 N 177
What you're seeing are your grouped columns as the index. If you call reset_index, the index levels are turned back into named columns, so
results = df.groupby(["X", "Y", "Z", "F"]).size()
results = results.reset_index()
should work. Note that reset_index returns a new object, so the result has to be assigned.
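To see why assigning to results.columns has no visible effect, it can help to inspect what size() actually returns. The sketch below uses a small made-up dataframe (the values are hypothetical; only the column names come from the question):

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({
    "X": [9, 9, 9, 9],
    "Y": ["27/02/2016"] * 4,
    "Z": [1, 1, 1, 2],
    "F": ["N", "N", "S", "N"],
})

results = df.groupby(["X", "Y", "Z", "F"]).size()

# size() returns a Series; the group keys live in its MultiIndex,
# so there are no columns to rename.
print(type(results).__name__)
print(list(results.index.names))

# reset_index() moves the index levels back into ordinary columns.
flat = results.reset_index()
print(list(flat.columns))
```

The counts end up in a column labelled 0 because the Series produced by size() has no name.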
In [11]:
df.groupby(["X","Y","Z","F"]).size()
Out[11]:
X Y Z F
9 27/02/2016 1 N 1
S 1
2 N 1
S 1
3 N 1
dtype: int64
In [12]:
df.groupby(["X","Y","Z","F"]).size().reset_index()
Out[12]:
X Y Z F 0
0 9 27/02/2016 1 N 1
1 9 27/02/2016 1 S 1
2 9 27/02/2016 2 N 1
3 9 27/02/2016 2 S 1
4 9 27/02/2016 3 N 1
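The unnamed 0 column in Out[12] can be given a label in the same step, because Series.reset_index accepts a name argument. A minimal sketch, again on made-up data:

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({
    "X": [9, 9, 9, 9],
    "Y": ["27/02/2016"] * 4,
    "Z": [1, 1, 1, 2],
    "F": ["N", "N", "S", "N"],
})

# name= labels the column that holds the Series values (the group sizes).
counts = df.groupby(["X", "Y", "Z", "F"]).size().reset_index(name="Count")
print(counts)
```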
Additionally, you can achieve what you want by using count:
In [13]:
df.groupby(["X","Y","Z","F"]).count().reset_index()
Out[13]:
X Y Z F Count
0 9 27/02/2016 1 N 1
1 9 27/02/2016 1 S 1
2 9 27/02/2016 2 N 1
3 9 27/02/2016 2 S 1
4 9 27/02/2016 3 N 1
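One thing to keep in mind is that count and size are not interchangeable: size gives the number of rows per group, while count gives the number of non-null values in each remaining column and keeps those columns' own names. The Count header in Out[13] presumably comes from a column of that name in the original data. A small sketch with a hypothetical value column containing a missing entry:

```python
import numpy as np
import pandas as pd

# Hypothetical data: "value" is an invented non-key column with one NaN.
df = pd.DataFrame({
    "X": [9, 9, 9],
    "Z": [1, 1, 2],
    "value": [10.0, np.nan, 30.0],
})

# size(): rows per group, regardless of missing values.
by_size = df.groupby(["X", "Z"]).size().reset_index(name="Count")

# count(): non-null values per remaining column, one result column each.
by_count = df.groupby(["X", "Z"]).count().reset_index()

print(by_size)
print(by_count)
```

If every column used for counting is free of missing values, the two approaches give the same numbers.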
You could also pass the param as_index=False here:
In [15]:
df.groupby(["X","Y","Z","F"], as_index=False).count()
Out[15]:
X Y Z F Count
0 9 27/02/2016 1 N 1
1 9 27/02/2016 1 S 1
2 9 27/02/2016 2 N 1
3 9 27/02/2016 2 S 1
4 9 27/02/2016 3 N 1
This is normally fine, but some aggregate functions will fail if you apply them to columns whose dtypes cannot be aggregated, for instance if you have str dtypes and you decide to call mean.
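One way to sidestep this is to select the numeric columns explicitly before aggregating. Depending on the pandas version, calling mean with a string column still in play may raise an error or drop that column silently, so being explicit avoids relying on either behaviour. The label and value columns below are invented for illustration:

```python
import pandas as pd

# Hypothetical data: "label" is a str column, "value" is numeric.
df = pd.DataFrame({
    "X": [9, 9, 9],
    "F": ["N", "S", "N"],
    "label": ["a", "b", "c"],
    "value": [1.0, 2.0, 5.0],
})

# Restrict the aggregation to the numeric column, so mean() never
# sees the str column.
means = df.groupby(["X", "F"], as_index=False)[["value"]].mean()
print(means)
```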
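You can also use the as_index=False parameter of the .groupby() function. In pandas 1.1 and later, .size() then returns a dataframe whose count column is named size, so that is the column to rename:
results = df.groupby(["X", "Y", "Z", "F"], as_index=False).size().rename(columns={'size': 'Count'})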