
Pandas 0.21 reindex() after get_dummies()

Background

My project is upgrading Pandas from 0.19.2 to 0.21.0. In the project, I have a DataFrame with one categorical column. I encode it with get_dummies() and then use reindex() to filter columns. However, if the columns argument passed to reindex() contains a non-encoded column, reindex() fails.

Sample Code

The code below works under 0.19.2 but fails under 0.21.0.

import pandas as pd

df = pd.DataFrame.from_items([('GDP', [1, 2]), ('Nation', ['AB', 'CD'])])
df = pd.get_dummies(df, columns=['Nation'], sparse=True)  # returns a SparseDataFrame
df.reindex(columns=['GDP'])  # raises TypeError under 0.21.0

The error message is:

df.reindex(columns=['GDP'])
....
TypeError: values must be SparseArray

What I Hope to Achieve

Use reindex(columns=...) to select a set of columns that contains both encoded and non-encoded columns. Thanks!

Update (2018-01-17)

An issue has been created on GitHub.

This certainly seems like a bug. As of v0.21, they've reworked a lot of their reindex API, so it seems something could've broken somewhere.

I don't have an answer, but I do have a workaround that will hopefully do: transpose first, reindex on the index, and then transpose back.

df.T.reindex(index=['GDP']).T

   GDP
0    1
1    2
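
If sparse storage is not essential, another option might be to encode densely and reindex the resulting ordinary DataFrame. The sketch below is an assumption, not something confirmed in the question or the GitHub issue: it drops sparse=True, builds the frame from a plain dict instead of from_items(), and the names encoded, wanted and filtered are made up for illustration. It has not been checked against 0.21.0 specifically.

```python
import pandas as pd

# Same toy data as in the question, built from a plain dict
df = pd.DataFrame({"GDP": [1, 2], "Nation": ["AB", "CD"]})

# Dense encoding: no sparse=True, so the result is an ordinary DataFrame
encoded = pd.get_dummies(df, columns=["Nation"])

# Keep one non-encoded column and one encoded column
wanted = ["GDP", "Nation_AB"]
filtered = encoded.reindex(columns=wanted)

print(filtered)
```

The trade-off is memory: a dense encoding stores every zero, so this probably only makes sense when the categorical column has few distinct values.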

