
Pandas pivot_table dropna param not working as expected

I'm pivoting a DataFrame to look at unique value counts within groups. I know some of the grouping columns have null values, and I want to include them. I can do this easily with .groupby([...], dropna=False), but I would like to use .pivot_table because it handles the unstacking, null-filling, totaling, etc. all in one function.

Sample data (taken from "python pandas: pivot_table silently drops indices with nans"):

import numpy as np
import pandas as pd

a = [['a', 'b', 12, 12, 12], ['a', np.nan, 12.3, 233., 12], ['b', 'a', 123.23, 123, 1], ['a', 'b', 1, 1, 1.]]

df = pd.DataFrame(a, columns=['a', 'b', 'c', 'd', 'e'])

print(df)

   a    b       c      d     e
0  a    b   12.00   12.0  12.0
1  a  NaN   12.30  233.0  12.0
2  b    a  123.23  123.0   1.0
3  a    b    1.00    1.0   1.0

Using .groupby to get the desired results

using_groupby = df.groupby([
    "a",
    "b"
], dropna=False).c.nunique().unstack(fill_value=0)

print(using_groupby)



b  a  b  NaN
a           
a  0  2    1
b  1  0    0

Code I expected would yield similar results using .pivot_table

using_pivot_table = df.pivot_table(
    index="a",
    columns="b",
    values="c",
    aggfunc="nunique",
    fill_value=0,
    dropna=False
)

print(using_pivot_table)



b  a  b
a      
a  0  2
b  1  0

Question

Is this a bug in the pivot_table function? Or am I misunderstanding the use of the dropna param?

Version Info

  • Python - 3.8.5
  • Pandas - 1.1.3

The dropna parameter controls whether columns whose entries are all NaN are included: dropna=True (the default) leaves them out, and dropna=False keeps them. Your issue is a different one: in the pandas version from the question, the pivot table does not display a column that has NaN as its column name. If you replace the NaN values in the grouping column with a placeholder string, the pivot table works as expected.

df['b'] = df['b'].fillna('No Value')
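Putting the workaround together with the question's sample data, a minimal sketch might look like the following. The placeholder string "No Value" comes from the line above; the names rows and filled are only illustrative, and the fill is applied to a copy (an assumption of this sketch, not part of the original answer) so that the NaN values in df stay intact.

```python
import numpy as np
import pandas as pd

# Sample data from the question; column "b" has one missing value.
rows = [
    ["a", "b", 12, 12, 12],
    ["a", np.nan, 12.3, 233.0, 12],
    ["b", "a", 123.23, 123, 1],
    ["a", "b", 1, 1, 1.0],
]
df = pd.DataFrame(rows, columns=["a", "b", "c", "d", "e"])

# Replace NaN in the grouping column with a placeholder string.
# assign() returns a new DataFrame, so df itself is left unchanged.
filled = df.assign(b=df["b"].fillna("No Value"))

# Same pivot as in the question, now run on the filled copy.
using_pivot_table = filled.pivot_table(
    index="a",
    columns="b",
    values="c",
    aggfunc="nunique",
    fill_value=0,
)

print(using_pivot_table)
```

The resulting table has a "No Value" column in place of the NaN column that the .groupby version produced, with the same counts in the other cells.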
