Pandas bfill and ffill: how to use them for numeric and non-numeric columns

Some of the NANs in my data are strings and some are numeric missing values. How can I use bfill and ffill in both cases?

df

Criteria      Col1   Col2   Col3   Col4
Jan10Sales    12     13     NAN    NAN
Feb10Sales    1      3      4      ABC
Mar10Sales    NAN    13     14     XY
Apr10Sales    5      NAN    12     V
May10Sales    6      18     19     AB

If the NaNs are real missing values, you can select the columns by passing a list of column names:

cols = ['Col1','Col2','Col3']
df[cols]=df[cols].bfill()
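
To see why the two cases in the question need different handling, here is a minimal sketch. The two Series are made-up illustrations, not the asker's data: one holds a real missing value, the other holds the placeholder string 'NAN'.

```python
import numpy as np
import pandas as pd

# Hypothetical mini-columns for illustration only.
real = pd.Series([np.nan, 4.0, 14.0])   # a true missing value (float NaN)
text = pd.Series(['NAN', 4, 14])        # the literal string 'NAN', object dtype

real_filled = real.bfill()   # the NaN is replaced by the next valid value
text_filled = text.bfill()   # nothing changes: 'NAN' is an ordinary string

print(real_filled.tolist())
print(text_filled.tolist())
print(text.isna().sum())     # pandas sees no missing values in `text`
```

Back filling only acts on values that pandas itself treats as missing, so the string placeholder has to be converted first.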

If the NaNs are strings, first convert the columns to numeric, so that every non-numeric value becomes a missing value:

cols = ['Col1','Col2','Col3']
df[cols]=df[cols].apply(lambda x: pd.to_numeric(x, errors='coerce')).bfill()

If you want to use your solution:

for col in ['Col1','Col2','Col3']:
    df[col]= pd.to_numeric(df[col], errors='coerce').bfill()

print (df)
     Criteria  Col1  Col2  Col3
0  Jan10Sales  12.0  13.0   4.0
1  Feb10Sales   1.0   3.0   4.0
2  Mar10Sales   5.0  13.0  14.0
3  Apr10Sales   5.0  18.0  12.0
4  May10Sales   6.0  18.0  19.0

But if the last rows have missing values, back filling does not replace them, because there is no next non-missing value:

print (df)
     Criteria Col1 Col2 Col3
0  Jan10Sales   12   13  NAN
1  Feb10Sales    1    3    4
2  Mar10Sales  NAN   13   14
3  Apr10Sales    5  NAN   12
4  May10Sales    6   18  NaN

cols = ['Col1','Col2','Col3']
df[cols]=df[cols].apply(lambda x: pd.to_numeric(x, errors='coerce')).bfill()
print (df)

     Criteria  Col1  Col2  Col3
0  Jan10Sales  12.0  13.0   4.0
1  Feb10Sales   1.0   3.0   4.0
2  Mar10Sales   5.0  13.0  14.0
3  Apr10Sales   5.0  18.0  12.0
4  May10Sales   6.0  18.0   NaN

Then it is possible to chain bfill and ffill:

df[cols]=df[cols].apply(lambda x: pd.to_numeric(x, errors='coerce')).bfill().ffill()
print (df)
     Criteria  Col1  Col2  Col3
0  Jan10Sales  12.0  13.0   4.0
1  Feb10Sales   1.0   3.0   4.0
2  Mar10Sales   5.0  13.0  14.0
3  Apr10Sales   5.0  18.0  12.0
4  May10Sales   6.0  18.0  12.0
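
For reference, the steps above can be put together as one self-contained sketch. The DataFrame is rebuilt by hand from the printed sample, so the exact construction is an assumption about how the original data was loaded.

```python
import numpy as np
import pandas as pd

# Sample rebuilt from the printed frame: 'NAN' strings plus one real NaN
# in the last row of Col3.
df = pd.DataFrame({
    'Criteria': ['Jan10Sales', 'Feb10Sales', 'Mar10Sales',
                 'Apr10Sales', 'May10Sales'],
    'Col1': [12, 1, 'NAN', 5, 6],
    'Col2': [13, 3, 13, 'NAN', 18],
    'Col3': ['NAN', 4, 14, 12, np.nan],
})

cols = ['Col1', 'Col2', 'Col3']

# Step 1: coerce anything non-numeric (here the 'NAN' strings) to NaN.
numeric = df[cols].apply(pd.to_numeric, errors='coerce')

# Step 2: back fill, then forward fill whatever is still missing at the end.
df[cols] = numeric.bfill().ffill()
print(df)
```

The order matters: bfill().ffill() prefers the next value and only falls back to the previous one for trailing gaps, while ffill().bfill() would do the opposite.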

You may try this:

for col in ['Col1','Col2','Col3']:
    # assign the result back; fillna(method='bfill', inplace=True) on df[col] is deprecated
    df[col] = df[col].bfill()

pandas.DataFrame.fillna

I guess the string 'NAN' is not a real missing value (NaN). You already have a solution, but you can check my code too:
df = df[df.ne('NAN')].bfill()

     Criteria Col1 Col2 Col3
0  Jan10Sales   12   13    4
1  Feb10Sales    1    3    4
2  Mar10Sales    5   13   14
3  Apr10Sales    5   18   12
4  May10Sales    6   18   19
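
The same masking idea also covers the non-numeric column from the question, which pd.to_numeric would wipe out entirely. A minimal sketch, assuming the placeholder is exactly the string 'NAN' and using only Col3 and Col4 from the sample:

```python
import pandas as pd

# Two columns from the question's sample: one numeric, one text.
df = pd.DataFrame({
    'Col3': ['NAN', 4, 14, 12, 19],
    'Col4': ['NAN', 'ABC', 'XY', 'V', 'AB'],
})

# df.ne('NAN') is True wherever the cell is not the placeholder;
# indexing with it turns every other cell into a real NaN.
masked = df[df.ne('NAN')]

# Now bfill works on both the numeric and the text column.
filled = masked.bfill()
print(filled)
```

Unlike the to_numeric approach, this keeps the text values in Col4 intact, but it does not convert Col3 to a float dtype, so apply pd.to_numeric afterwards if numeric dtypes are needed.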
