Pandas pivot_table dropna param not working as expected
I'm pivoting a dataframe to look at unique value counts within groups. I know some of the grouping columns have null values and I want to include them. I can do this easily with .groupby([...], dropna=False), but I would like to use .pivot_table, as it handles the unstacking, null-filling, totaling, etc. all in one function.
import numpy as np
import pandas as pd
a = [['a', 'b', 12, 12, 12], ['a', np.nan, 12.3, 233., 12], ['b', 'a', 123.23, 123, 1], ['a', 'b', 1, 1, 1.]]
df = pd.DataFrame(a, columns=['a', 'b', 'c', 'd', 'e'])
print(df)
   a    b       c      d     e
0  a    b   12.00   12.0  12.0
1  a  NaN   12.30  233.0  12.0
2  b    a  123.23  123.0   1.0
3  a    b    1.00    1.0   1.0
Using .groupby to get the desired results:
using_groupby = df.groupby([
"a",
"b"
], dropna=False).c.nunique().unstack(fill_value=0)
print(using_groupby)
b  a  b  NaN
a
a  0  2    1
b  1  0    0
I would expect .pivot_table to produce similar results:
using_pivot_table = df.pivot_table(
index="a",
columns="b",
values="c",
aggfunc="nunique",
fill_value=0,
dropna=False
)
print(using_pivot_table)
b  a  b
a
a  0  2
b  1  0
Is this a bug in the pivot_table function? Or am I not understanding the use of the dropna param?
dropna=True (the default) means do not include columns whose entries are all NaN; dropna=False keeps such columns. Your issue is different: the pivot table is not displaying a column with NaN as the column name. If you change the NaN value to another string, then the pivot table works as expected.
df['b'] = df['b'].fillna('No Value')
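For completeness, here is a minimal sketch of the whole workaround applied to the example data from the question. It assumes the placeholder string 'No Value' does not already occur in column b; the name filled is only illustrative. It fills a copy of the frame, so the original NaN values stay in df.

```python
import numpy as np
import pandas as pd

# Example data from the question.
a = [['a', 'b', 12, 12, 12], ['a', np.nan, 12.3, 233., 12],
     ['b', 'a', 123.23, 123, 1], ['a', 'b', 1, 1, 1.]]
df = pd.DataFrame(a, columns=['a', 'b', 'c', 'd', 'e'])

# Replace NaN in the grouping column with a placeholder label.
# assign() returns a new frame, so df itself is left unchanged.
filled = df.assign(b=df['b'].fillna('No Value'))

# With no NaN keys left, every group gets its own column.
using_pivot_table = filled.pivot_table(
    index="a",
    columns="b",
    values="c",
    aggfunc="nunique",
    fill_value=0,
)
print(using_pivot_table)
```

The result should contain a 'No Value' column alongside a and b, holding the counts that the groupby version reports under NaN.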