Pandas pivot_table dropna parameter not working as expected

Question

I'm pivoting a dataframe to look at unique value counts within groups. I know some of the grouping columns have null values and I want to include them. I can do this easily with a .groupby([...], dropna=False), but I would like to use .pivot_table because it handles the unstacking, null-filling, totaling, etc. all in one function.

Sample Data (taken from "python pandas: pivot_table silently drops indices with nans")

import numpy as np
import pandas as pd

a = [['a', 'b', 12, 12, 12], ['a', np.nan, 12.3, 233., 12], ['b', 'a', 123.23, 123, 1], ['a', 'b', 1, 1, 1.]]

df = pd.DataFrame(a, columns=['a', 'b', 'c', 'd', 'e'])

print(df)

   a    b       c      d     e
0  a    b   12.00   12.0  12.0
1  a  NaN   12.30  233.0  12.0
2  b    a  123.23  123.0   1.0
3  a    b    1.00    1.0   1.0

Using `.groupby` to get the desired results

using_groupby = df.groupby([
    "a",
    "b"
], dropna=False).c.nunique().unstack(fill_value=0)

print(using_groupby)



b  a  b  NaN
a           
a  0  2    1
b  1  0    0

Code I expected would yield similar results using `.pivot_table`

using_pivot_table = df.pivot_table(
    index="a",
    columns="b",
    values="c",
    aggfunc="nunique",
    fill_value=0,
    dropna=False
)

print(using_pivot_table)



b  a  b
a      
a  0  2
b  1  0

Question

Is this a bug in the pivot_table function? Or am I misunderstanding the use of the dropna param?

Version Info

Python - 3.8.5
Pandas - 1.1.3

Answer 1

The dropna parameter of pivot_table controls whether columns whose entries are all NaN are dropped from the result: dropna=True (the default) leaves them out, and dropna=False keeps them. Your issue is a different one: the pivot table is not displaying a column with NaN as the column name, and, as your output shows, dropna=False does not change that in your pandas version. If you replace the NaN values with a placeholder string, the pivot table works as expected.

df['b'] = df['b'].fillna('No Value')
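
For reference, here is a minimal end-to-end sketch of this workaround, assuming the sample data from the question. The placeholder label 'No Value' is arbitrary; any string that does not already occur in column b should do. The variable name `result` is only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question; row 1 has NaN in grouping column "b".
a = [['a', 'b', 12, 12, 12],
     ['a', np.nan, 12.3, 233., 12],
     ['b', 'a', 123.23, 123, 1],
     ['a', 'b', 1, 1, 1.]]
df = pd.DataFrame(a, columns=['a', 'b', 'c', 'd', 'e'])

# Replace NaN in the grouping column with a placeholder label so the
# affected rows are grouped under that label instead of being left out.
df['b'] = df['b'].fillna('No Value')

# Same pivot as in the question: unique counts of "c" per (a, b) pair.
result = df.pivot_table(
    index="a",
    columns="b",
    values="c",
    aggfunc="nunique",
    fill_value=0,
    dropna=False,
)

print(result)
```

The rows that previously had NaN in b should now appear under their own 'No Value' column, which plays the role of the NaN column in the `.groupby` output.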

Answered 2021-04-23 17:51:47
