Pandas fillna() not filling values from series

Question

I'm trying to fill missing values in a column in a DataFrame with the value from another DataFrame's column. Here's the setup:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [2, 3, 5, np.nan, np.nan],
    'b': [10, 11, 13, 14, 15]
})

df2 = pd.DataFrame({
    'x': [1]
})

I can of course do this and it works:

df['a'] = df['a'].fillna(1)

However, this results in the missing values not being filled:

df['a'] = df['a'].fillna(df2['x'])

And this results in an error:

df['a'] = df['a'].fillna(df2['x'].values)

How can I use the value from df2['x'] to fill in the missing values in df['a']?

Answer 1

If you can guarantee that df2['x'] holds exactly one element, use .item() to extract it as a scalar:

df['a'] = df['a'].fillna(df2.values.item())

Or, selecting the column first, which also works when df2 has other columns:

df['a'] = df['a'].fillna(df2['x'].item())

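Otherwise, this isn't possible unless the two Series are index-aligned. fillna matches fill values to missing entries by index label, not by position. Here df2['x'] only has the label 0, and df['a'] is not missing at label 0, so nothing gets filled.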
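To make the alignment behaviour concrete, here is a minimal sketch using the frames from the question. The `filler` Series is a hypothetical example, not part of the original question; it is only there to show what happens when the labels do match the missing rows.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [2, 3, 5, np.nan, np.nan],
    'b': [10, 11, 13, 14, 15]
})
df2 = pd.DataFrame({'x': [1]})

# df2['x'] only carries the label 0. fillna aligns on labels, and
# df['a'] is not missing at label 0, so the NaNs at 3 and 4 remain.
unchanged = df['a'].fillna(df2['x'])

# Hypothetical filler whose labels match the missing rows (3 and 4).
filler = pd.Series([1, 1], index=[3, 4])
filled = df['a'].fillna(filler)

print(unchanged.isna().sum())
print(filled.tolist())
```

The first call leaves both NaNs in place, while the second fills them because the labels line up.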

As a rule of thumb, either:

pass a scalar, or
pass a dictionary mapping the index of each NaN value to its replacement value (e.g., df['a'].fillna({3: 1, 4: 1})), or
pass an index-aligned Series
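The three options above could look roughly like this. It is only a sketch: the Series `a` stands in for df['a'] from the question, and the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in for df['a'] from the question.
a = pd.Series([2, 3, 5, np.nan, np.nan])

# 1. A scalar is used for every missing entry.
by_scalar = a.fillna(1)

# 2. A dict maps index labels to replacement values.
by_dict = a.fillna({3: 1, 4: 1})

# 3. A Series sharing the same index is matched label by label.
by_series = a.fillna(pd.Series(1, index=a.index))

print(by_scalar.tolist())
```

All three calls should produce the same filled column for this data.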

Answer 2

I think one general solution is to select the first value with [0] to get a scalar:

print(df2['x'].values[0])
1

df['a'] = df['a'].fillna(df2['x'].values[0])
# similar solution, selecting by label with loc
# df['a'] = df['a'].fillna(df2.loc[0, 'x'])
print(df)
     a   b
0  2.0  10
1  3.0  11
2  5.0  13
3  1.0  14
4  1.0  15
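One caveat with the loc variant is that it looks the value up by label, so it depends on df2 having a row labelled 0. A positional lookup with .iloc[0] avoids that. The sketch below assumes a hypothetical df2 whose index starts at 7, which is not the case in the question:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [2, 3, 5, np.nan, np.nan],
    'b': [10, 11, 13, 14, 15]
})

# Hypothetical variant of df2 whose index does not start at 0.
df2 = pd.DataFrame({'x': [1]}, index=[7])

# .iloc[0] selects the first row by position, whatever its label is.
# df2.loc[0, 'x'] would raise a KeyError here, since no row is labelled 0.
fill_value = df2['x'].iloc[0]

df['a'] = df['a'].fillna(fill_value)
print(df)
```

df2['x'].values[0] is positional as well, so it behaves the same way as .iloc[0] in this case.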

2 answers

solution1
3 2018-03-08 15:53:41

solution2
1 ACCEPTED 2018-03-08 15:53:35
