Replace column of pandas multi-index DataFrame with another DataFrame

Question

I have a pandas DataFrame like this:

import pandas as pd
import numpy as np

data1 = np.repeat(np.array(range(3), ndmin=2), 3, axis=0)
columns1 = pd.MultiIndex.from_tuples([('foo', 'a'), ('foo', 'b'), ('bar', 'c')])
df1 = pd.DataFrame(data1, columns=columns1)
print(df1)

  foo    bar
    a  b   c
0   0  1   2
1   0  1   2
2   0  1   2

And another one like this:

data2 = np.repeat(np.array(range(3, 5), ndmin=2), 3, axis=0)
columns2 = ['d', 'e']
df2 = pd.DataFrame(data2, columns=columns2)
print(df2)

   d  e
0  3  4
1  3  4
2  3  4

Now, I would like to replace 'bar' of df1 with df2, but the regular syntax of single-level indexing doesn't seem to work:

df1['bar'] = df2
print(df1)

  foo    bar
    a  b   c
0   0  1 NaN
1   0  1 NaN
2   0  1 NaN

When what I would like to get is:

  foo    bar
    a  b   d  e
0   0  1   3  4
1   0  1   3  4
2   0  1   3  4

I'm not sure whether I'm missing something in the syntax or whether this is related to the issues described here and here. Could someone explain why this doesn't work and how to get the desired outcome?

I'm using Python 2.7 and pandas 0.24, if it makes a difference.

Answer 1

For lack of a better alternative, I'm currently doing this:

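# Nest df2's columns under the top-level label 'bar' (this modifies df2 in place)
df2.columns = pd.MultiIndex.from_product([['bar'], df2.columns])
# Drop the existing 'bar' block from df1, then attach the new one
df1.drop(columns='bar', level=0, inplace=True)
df1 = df1.join(df2)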

This gives the desired result. Be careful if the column order matters, though: the new 'bar' columns are appended after the remaining columns of df1, so the order can change.
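
If the column order does matter, one possible variation is to record the top-level labels before dropping 'bar' and rebuild the frame in that order afterwards. The following is only a sketch under a few assumptions: it uses Python 3.7+ (for ordered dict keys) and a reasonably recent pandas, the names top_level_order, replacement and result are made up for illustration, and it builds the new block with pd.concat instead of reassigning df2.columns so that df2 is left untouched.

```python
import numpy as np
import pandas as pd

# Same frames as in the question.
df1 = pd.DataFrame(
    np.repeat(np.array(range(3), ndmin=2), 3, axis=0),
    columns=pd.MultiIndex.from_tuples([('foo', 'a'), ('foo', 'b'), ('bar', 'c')]),
)
df2 = pd.DataFrame(
    np.repeat(np.array(range(3, 5), ndmin=2), 3, axis=0),
    columns=['d', 'e'],
)

# Remember the order of the top-level labels before changing anything
# (top_level_order is an illustrative name, not a pandas feature).
top_level_order = list(dict.fromkeys(df1.columns.get_level_values(0)))

# Nest df2 under the key 'bar' without mutating df2 itself.
replacement = pd.concat({'bar': df2}, axis=1)

# Drop the old 'bar' block and append the new one at the end.
result = pd.concat([df1.drop(columns='bar', level=0), replacement], axis=1)

# Rebuild the frame block by block in the original top-level order.
result = pd.concat([result[[key]] for key in top_level_order], axis=1)

print(result)
```

With the frames from the question 'bar' is already the last block, so the final step changes nothing here; it only matters when the replaced block sits somewhere in the middle.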

Having read further into the GitHub issues mentioned in the question, I think the reason the approach in the question doesn't work is indeed an inconsistency in the pandas API that hasn't been fixed yet.
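
As for where the NaN values could come from, my best guess (an assumption on my part, not something confirmed by the linked issues) is that the assignment aligns the right-hand side to the sub-column labels that already exist under 'bar'. Only 'c' exists there, and df2 has no column 'c', so nothing matches. The sketch below does not run the assignment itself, whose behaviour may differ between pandas versions; it only reproduces that alignment step explicitly with reindex.

```python
import numpy as np
import pandas as pd

# Same frames as in the question.
df1 = pd.DataFrame(
    np.repeat(np.array(range(3), ndmin=2), 3, axis=0),
    columns=pd.MultiIndex.from_tuples([('foo', 'a'), ('foo', 'b'), ('bar', 'c')]),
)
df2 = pd.DataFrame(
    np.repeat(np.array(range(3, 5), ndmin=2), 3, axis=0),
    columns=['d', 'e'],
)

# Sub-columns that currently live under the top-level label 'bar'.
existing_sub_columns = df1['bar'].columns

# Aligning df2 to those labels keeps only 'c', which df2 does not have,
# so the resulting column is entirely NaN.
aligned = df2.reindex(columns=existing_sub_columns)

print(aligned)
```

If this reading is right, it would also explain why 'd' and 'e' never show up in the result: they are discarded during alignment rather than added as new sub-columns.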
