
pandas reindexing multiindex not working properly

I have a pandas (version 1.0.5) DataFrame with a two-level MultiIndex, for example:

import numpy as np
import pandas as pd

mi = pd.MultiIndex.from_product((('a', 'c'), (5, 12)))
np.random.seed(123)
df = pd.DataFrame(data=np.random.rand(4, 2), index=mi, columns=['x', 'y'])

I want to reindex the first level of the MultiIndex so that it contains the keys ['a', 'b', 'c', 'd']. Missing values should be filled with np.nan.

For a DataFrame without a MultiIndex, I would simply reindex with df.reindex(index=['a', 'b', 'c', 'd']).
With the MultiIndex, I assumed that the following should work (I also tried all other combinations of the arguments labels, axis and index):

df.reindex(index=['a', 'b', 'c', 'd'], level=0)

Instead, the call seems to have no effect and returns the unaltered DataFrame:

             x         y
a 5   0.696469  0.286139
  12  0.226851  0.551315
c 5   0.719469  0.423106
  12  0.980764  0.684830
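To see whether a given pandas installation shows the same behaviour, the result can be compared with the input. This is only a sketch that rebuilds the example frame from above. It prints what it finds and assumes nothing about the outcome, which may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question
mi = pd.MultiIndex.from_product((('a', 'c'), (5, 12)))
np.random.seed(123)
df = pd.DataFrame(data=np.random.rand(4, 2), index=mi, columns=['x', 'y'])

# Attempt the level-wise reindex described in the question
out = df.reindex(index=['a', 'b', 'c', 'd'], level=0)

# True means reindex returned the frame unaltered on this pandas version
unchanged = out.equals(df)
print(pd.__version__, "unchanged:", unchanged)
```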

The only way I have found to reindex the MultiIndex is to generate a complete new MultiIndex:

df.reindex(index=pd.MultiIndex.from_product((
    ['a', 'b', 'c', 'd'], df.index.get_level_values(1).unique())))

IMHO there must be an easier way to do this; otherwise I don't see any use for the level argument of the reindex method. Furthermore, I quite often have several index levels, which makes reindexing this way extremely cumbersome.

Did I miss anything? Any idea how to reindex directly without having to explicitly generate a new MultiIndex?

This behaviour is not expected. Passing the level argument to reindex on a MultiIndex still appears to be broken in pandas version 1.2.3. There is an issue on GitHub covering this:

https://github.com/pandas-dev/pandas/issues/25460
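Until that is resolved, the workaround from the question can be wrapped in a small function so that it also covers frames with more than two index levels. The following is a minimal sketch, not a pandas API: reindex_level is a hypothetical helper name, and it assumes that level is given as an integer position. Like the original workaround, it builds the full Cartesian product of all level values, so combinations that were absent from the original index also appear as NaN rows.

```python
import numpy as np
import pandas as pd


def reindex_level(df, labels, level=0):
    """Hypothetical helper: reindex one level of a MultiIndex by
    rebuilding the full product of all level values."""
    # Unique values currently present in each level, in order of appearance
    level_values = [
        df.index.get_level_values(i).unique()
        for i in range(df.index.nlevels)
    ]
    # Swap in the requested labels for the chosen level (integer position)
    level_values[level] = pd.Index(labels)
    new_index = pd.MultiIndex.from_product(level_values, names=df.index.names)
    # Rows missing from the original index are filled with NaN
    return df.reindex(index=new_index)


# Example frame from the question
mi = pd.MultiIndex.from_product((('a', 'c'), (5, 12)))
np.random.seed(123)
df = pd.DataFrame(data=np.random.rand(4, 2), index=mi, columns=['x', 'y'])

result = reindex_level(df, ['a', 'b', 'c', 'd'], level=0)
print(result)
```

The existing rows for 'a' and 'c' keep their values, while the new rows for 'b' and 'd' contain only NaN.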
