Filling in missing data in a Pandas dataframe

Question

I have a Pandas dataframe with a two-level index:

                              Column1
indexA   indexB                        
1001     aaa                        1
         bbb                        1
         ccc                        1
1002     ddd                        1
         eee                        1

and would like indexB to have the same set of values for each value of indexA:

                              Column1
indexA   indexB                        
1001     aaa                        1
         bbb                        1
         ccc                        1
         ddd                        0
         eee                        0
1002     aaa                        0
         bbb                        0
         ccc                        0
         ddd                        1
         eee                        1

My first thought was to unstack, fillna with 0, and then stack it again, but this seems like overkill. Is there an easier method?
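
For reference, the unstack / fillna / stack idea described above could look roughly like the sketch below. The sample frame is rebuilt by hand from the table in the question, and the name `filled` is only illustrative. The final `astype(int)` is there because `fillna` runs on a frame that already contains NaN, so the values come back as floats.

```python
import pandas as pd

# Sample data shaped like the frame in the question (rebuilt by hand).
df = pd.DataFrame(
    {
        "indexA": [1001, 1001, 1001, 1002, 1002],
        "indexB": ["aaa", "bbb", "ccc", "ddd", "eee"],
        "Column1": [1, 1, 1, 1, 1],
    }
).set_index(["indexA", "indexB"])

# Move indexB into the columns, fill the missing combinations with 0,
# then move indexB back into the row index.
filled = df.unstack().fillna(0).stack().astype(int)

print(filled)
```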

EDIT: Alexander's answer below works, though it takes a long time (my original dataframe has 350k rows). I changed that solution slightly:

df = pd.read_sql(sql=sql, con=db_eng, index_col=index)
idx = pd.MultiIndex.from_product([df.index.levels[0], df.index.levels[1]], names=df.index.names)
df = df.reindex(idx).fillna(value=0)
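
Since the snippet above depends on a database connection, here is a self-contained sketch of the same reindex idea on the sample data from the question. The frame is rebuilt by hand, and `result` is just an illustrative name. Passing `fill_value=0` to `reindex` is a small variation on the `fillna` call: the new rows are filled as they are created, which should keep the column as integers.

```python
import pandas as pd

# Sample data shaped like the frame in the question (rebuilt by hand).
df = pd.DataFrame(
    {"Column1": [1, 1, 1, 1, 1]},
    index=pd.MultiIndex.from_tuples(
        [(1001, "aaa"), (1001, "bbb"), (1001, "ccc"), (1002, "ddd"), (1002, "eee")],
        names=["indexA", "indexB"],
    ),
)

# Every (indexA, indexB) combination of the values found in each level.
idx = pd.MultiIndex.from_product(df.index.levels, names=df.index.names)

# Rows that did not exist before are created with Column1 = 0.
result = df.reindex(idx, fill_value=0)

print(result)
```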

Also found these two questions after posting this:

Answer 1

There is probably a better way to do this. I created a new MultiIndex using pd.MultiIndex.from_product. I then created a new dataframe with a dummy value, joined the existing dataframe to it, and deleted the dummy column.

import pandas as pd

df = pd.DataFrame({'index_0': ['a', 'a', 'b', 'b', 'b'],
                   'index_1': ['A', 'B', 'A', 'B', 'C'],
                   'vals': [1, 2, 3, 4, 5]}).set_index(['index_0', 'index_1'])

>>> df 
                 vals
index_0 index_1      
a       A           1
        B           2
b       A           3
        B           4
        C           5

idx = pd.MultiIndex.from_product([df.index.levels[0], df.index.levels[1]], 
                                 names=df.index.names)
new_df = pd.DataFrame({'_dummy_': [1] * len(idx)}, index=idx).join(df)
del new_df['_dummy_']

>>> new_df
                 vals
index_0 index_1      
a       A           1
        B           2
        C         NaN
b       A           3
        B           4
        C           5
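
The result above still holds NaN for the new ('a', 'C') row, whereas the question asks for 0. A possible follow-up, sketched here on the same sample data, is to fill the gap and cast back to integers. The `drop(columns=...)` call is simply an alternative to `del` and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'index_0': ['a', 'a', 'b', 'b', 'b'],
                   'index_1': ['A', 'B', 'A', 'B', 'C'],
                   'vals': [1, 2, 3, 4, 5]}).set_index(['index_0', 'index_1'])

# Full product of both index levels, as in the answer.
idx = pd.MultiIndex.from_product(df.index.levels, names=df.index.names)

# Join the original data onto a dummy frame that has every combination.
new_df = pd.DataFrame({'_dummy_': 1}, index=idx).join(df).drop(columns='_dummy_')

# The join leaves NaN for ('a', 'C') and turns vals into floats;
# fill with 0 and cast back to integers.
new_df['vals'] = new_df['vals'].fillna(0).astype(int)

print(new_df)
```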

Answer 1
2 accepted 2016-02-03 16:32:30
