Pandas fillna() not filling values from series

I'm trying to fill missing values in a column in a DataFrame with the value from another DataFrame's column. Here's the setup:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [2, 3, 5, np.nan, np.nan],
    'b': [10, 11, 13, 14, 15]
})

df2 = pd.DataFrame({
    'x': [1]
})

I can of course do this and it works:

df['a'] = df['a'].fillna(1)

However, this results in the missing values not being filled:

df['a'] = df['a'].fillna(df2['x'])

And this results in an error:

df['a'] = df['a'].fillna(df2['x'].values)

How can I use the value from df2['x'] to fill in the missing values in df['a']?

If you can guarantee that df2['x'] has only a single element, use .item():

df['a'] = df['a'].fillna(df2.values.item())

Or,

df['a'] = df['a'].fillna(df2['x'].item())

df
     a   b
0  2.0  10
1  3.0  11
2  5.0  13
3  1.0  14
4  1.0  15

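Otherwise, this isn't possible unless the two Series are index-aligned. fillna matches the fill Series to df['a'] by index label, not by position. Here df2['x'] only has label 0, while the NaNs in df['a'] are at labels 3 and 4, so nothing gets filled.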
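To make the alignment behaviour concrete, here is a small sketch using the question's data. The variable names `unchanged`, `same_length` and `filled` are mine, chosen only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [2, 3, 5, np.nan, np.nan],
    'b': [10, 11, 13, 14, 15]
})
df2 = pd.DataFrame({'x': [1]})

# df2['x'] carries only label 0, which is not NaN in df['a'],
# so the NaNs at labels 3 and 4 are left untouched.
unchanged = df['a'].fillna(df2['x'])

# A Series whose labels cover 3 and 4 does fill them; the values
# at labels 0-2 are ignored because df['a'] is not NaN there.
same_length = pd.Series([9, 9, 9, 7, 8])
filled = df['a'].fillna(same_length)
```

Only the entries of the fill Series whose labels coincide with a NaN in df['a'] are ever used.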

As a rule of thumb, pass one of the following to fillna:

  1. a scalar,
  2. a dictionary mapping the index label of each NaN value to its replacement value (e.g., df.a.fillna({3: 1, 4: 1})), or
  3. an index-aligned Series.
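The three options can be sketched side by side on the question's data. The names `nan_labels`, `aligned` and the `*_fill` variables are illustrative, not from the original post.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [2, 3, 5, np.nan, np.nan],
    'b': [10, 11, 13, 14, 15]
})
df2 = pd.DataFrame({'x': [1]})

# 1. Scalar: every NaN gets the same value.
scalar_fill = df['a'].fillna(1)

# 2. Dictionary: keys are index labels of the NaN rows.
dict_fill = df['a'].fillna({3: 1, 4: 1})

# 3. Index-aligned Series: repeat the single value from df2['x']
#    over the labels where df['a'] is NaN.
nan_labels = df.index[df['a'].isna()]
aligned = pd.Series(df2['x'].iloc[0], index=nan_labels)
series_fill = df['a'].fillna(aligned)
```

All three produce the same column here; the dictionary and Series forms only become necessary when different rows need different replacement values.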

I think one general solution is to select the first value with [0] to get a scalar:

print(df2['x'].values[0])
1

df['a'] = df['a'].fillna(df2['x'].values[0])
# A similar solution selects the value by label with loc:
# df['a'] = df['a'].fillna(df2.loc[0, 'x'])
print(df)
     a   b
0  2.0  10
1  3.0  11
2  5.0  13
3  1.0  14
4  1.0  15
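One difference between the two approaches above is worth noting: .item() refuses to guess when there is more than one element, while [0] silently takes the first. A minimal sketch, using a hypothetical two-element Series `multi` that is not part of the question:

```python
import pandas as pd

# Hypothetical fill source with more than one element.
multi = pd.Series([1, 2])

# Positional selection always returns the first element.
first = multi.values[0]

# .item() only works on exactly one element and raises otherwise.
try:
    multi.item()
    item_failed = False
except ValueError:
    item_failed = True
```

So .item() doubles as a check that the single-element assumption holds, whereas [0] is the more forgiving choice.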
