Adding column headers to a new pandas DataFrame

Question

I am creating a new pandas dataframe from an existing dataframe using the .groupby and .size methods.

[in] results = df.groupby(["X", "Y", "Z", "F"]).size()

[out]
    9   27/02/2016  1   N   326
    9   27/02/2016  1   S   332
    9   27/02/2016  2   N   280
    9   27/02/2016  2   S   353
    9   27/02/2016  3   N   177

This behaves as expected; however, the result is a dataframe with no column headers.

This SO question states that the following adds column names to the generated dataframe:

[in] results.columns = ["X","Y","Z","F","Count"]

However, this does not seem to have any effect at all.

[out]
        9   27/02/2016  1   N   326
        9   27/02/2016  1   S   332
        9   27/02/2016  2   N   280
        9   27/02/2016  2   S   353
        9   27/02/2016  3   N   177

Answer 1

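What you're seeing are your grouped columns as the index. The result of size() is a Series with those columns as index levels, which is why assigning to .columns changes nothing. If you call reset_index, the index levels become regular columns again and the column names are restored.

So

results = df.groupby(["X", "Y", "Z", "F"]).size()
results = results.reset_index()  # reset_index returns a new object, so assign it back

should work: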

In [11]:
df.groupby(["X","Y","Z","F"]).size()

Out[11]:
X  Y           Z  F
9  27/02/2016  1  N    1
                  S    1
               2  N    1
                  S    1
               3  N    1
dtype: int64

In [12]:    
df.groupby(["X","Y","Z","F"]).size().reset_index()

Out[12]:
   X           Y  Z  F  0
0  9  27/02/2016  1  N  1
1  9  27/02/2016  1  S  1
2  9  27/02/2016  2  N  1
3  9  27/02/2016  2  S  1
4  9  27/02/2016  3  N  1
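
The count column above is labelled 0 because the Series returned by size() has no name. One way to label it directly is the name argument of Series.reset_index. The following is a minimal sketch, assuming a small made-up frame: only the column names come from the question, the values are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
df = pd.DataFrame({
    "X": [9, 9, 9, 9, 9],
    "Y": ["27/02/2016"] * 5,
    "Z": [1, 1, 2, 2, 2],
    "F": ["N", "S", "N", "N", "S"],
})

# size() returns a Series indexed by the group keys, so it has no .columns.
sizes = df.groupby(["X", "Y", "Z", "F"]).size()

# reset_index moves the index levels back into columns;
# name= sets the label of the column holding the group sizes.
results = sizes.reset_index(name="Count")
print(results)
```

After this, results is an ordinary DataFrame, so renaming or reassigning its columns behaves as the question expected.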

Additionally, you can achieve what you want by using count, which counts the non-null values in each remaining column and keeps that column's name:

In [13]:
df.groupby(["X","Y","Z","F"]).count().reset_index()

Out[13]:
   X           Y  Z  F  Count
0  9  27/02/2016  1  N      1
1  9  27/02/2016  1  S      1
2  9  27/02/2016  2  N      1
3  9  27/02/2016  2  S      1
4  9  27/02/2016  3  N      1
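
Note that size and count are not interchangeable when the data contains missing values: size counts rows per group, while count counts non-null values per column. A small sketch of the difference, using an invented two-column frame (the column name Value is an assumption for illustration, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in the "N" group.
df = pd.DataFrame({
    "F": ["N", "N", "S"],
    "Value": [1.0, np.nan, 3.0],
})

# Number of rows in each group, missing values included.
row_counts = df.groupby("F").size()

# Number of non-null entries of "Value" in each group.
non_null_counts = df.groupby("F")["Value"].count()

print(row_counts)
print(non_null_counts)
```

If the goal is the number of rows per group, size is the safer choice; count is appropriate when missing values should be excluded.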

You could also pass the parameter as_index=False here:

In [15]:
df.groupby(["X","Y","Z","F"], as_index=False).count()

Out[15]:
   X           Y  Z  F  Count
0  9  27/02/2016  1  N      1
1  9  27/02/2016  1  S      1
2  9  27/02/2016  2  N      1
3  9  27/02/2016  2  S      1
4  9  27/02/2016  3  N      1

This is normally fine, but some aggregation methods will fail if you apply them to columns whose dtypes cannot be aggregated, for instance if you have str columns and you call mean.
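
One way to sidestep that problem is to select the numeric columns explicitly before aggregating. This is a sketch under assumptions: the frame and the column names Label and Value are invented for illustration.

```python
import pandas as pd

# Hypothetical data: "Label" holds strings and cannot be averaged.
df = pd.DataFrame({
    "F": ["N", "N", "S"],
    "Label": ["a", "b", "c"],
    "Value": [1.0, 3.0, 5.0],
})

# Restrict the aggregation to the numeric column; with as_index=False
# the group key "F" stays a regular column in the result.
means = df.groupby("F", as_index=False)[["Value"]].mean()
print(means)
```

Selecting columns up front keeps the call independent of how a given pandas version treats non-numeric columns in mean.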

Answer 2

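You can use the as_index=False parameter of the .groupby() function. In pandas 1.1 and later, size() then returns a DataFrame whose count column is named size, so that is the column to rename:

results = df.groupby(["X", "Y", "Z", "F"], as_index=False).size().rename(columns={"size": "Count"})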
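
What size() returns with as_index=False has, to my understanding, changed across pandas releases: older versions give back a Series, newer ones a DataFrame with a column named size. A hedged sketch that handles both cases, again on invented sample data:

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
df = pd.DataFrame({
    "X": [9, 9, 9],
    "Y": ["27/02/2016"] * 3,
    "Z": [1, 1, 2],
    "F": ["N", "N", "S"],
})

keys = ["X", "Y", "Z", "F"]
out = df.groupby(keys, as_index=False).size()

if isinstance(out, pd.Series):
    # Older behaviour (assumed): a Series indexed by the group keys.
    results = out.reset_index(name="Count")
else:
    # Newer behaviour: a DataFrame whose count column is named "size".
    results = out.rename(columns={"size": "Count"})

print(results)
```

If only one pandas version matters, the branch for that version is enough on its own.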

Answer 1: score 6, accepted, 2016-05-04 22:04:47
Answer 2: score 2, 2016-05-04 22:08:46
