
How to solve the issue of filling all the values with None?

I am trying to fill the missing values in a data frame, but all of the values were replaced with None.

Here is the example I have tried:

# Basic libraries
import os
import pandas as pd
import numpy as np

# Visualization libraries
import matplotlib.pyplot as plt
import seaborn as sns
import folium
#import folium.plugins as plugins
from wordcloud import WordCloud
import plotly.express as px

data_dict = {'First':[100, 90, np.nan, 95], 
        'Second': [30, 45, 56, np.nan], 
        'Third':[np.nan, 40, 80, 98]} 
  
# Creating a DataFrame from a dict of lists
df1 = pd.DataFrame(data_dict)

#first_try_with_column_name
df1.loc[:,'First'] = df1.loc[:,'First'].fillna(method='ffill', inplace=True)

#Second_try_Using_List_of_Columns
list_columns = ['First','Second','Third']
df1.loc[:,list_columns] = df1.loc[:,list_columns].fillna(value, inplace=True)
df1

As shown, I tried more than one approach to understand the cause: first a single column name, then a list of column names. Unfortunately, the result is the same in both cases.

Is there any recommendation, please?

Change

df1.loc[:,'First'] = df1.loc[:,'First'].fillna(method='ffill', inplace=True)

to

df1.loc[:,'First'].fillna(method='ffill', inplace=True)

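This is because inplace=True modifies the object the method is called on instead of returning a new one.

As for the None values, they come from the return value: an in-place fillna has nothing to return, so it returns None, and assigning that None back to the column replaces every value with None.

Note that in recent pandas versions an in-place call on a selection such as df1.loc[:,'First'] may act on a temporary object and leave df1 unchanged, and fillna(method='ffill') is deprecated in favor of ffill(). Assigning the result of a non-inplace call back to the column avoids both problems.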
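
As a minimal sketch of the non-inplace form, assuming the same df1 as in the question (rebuilt here so the snippet runs on its own): Series.ffill() returns the filled Series, so assigning its result back to the column is safe.

```python
import numpy as np
import pandas as pd

# Same data as in the question
df1 = pd.DataFrame({'First': [100, 90, np.nan, 95],
                    'Second': [30, 45, 56, np.nan],
                    'Third': [np.nan, 40, 80, 98]})

# No inplace=True here: ffill() returns a new, filled Series,
# and that Series is what gets assigned back to the column.
df1['First'] = df1['First'].ffill()

print(df1)
```

Only the 'First' column is touched; the NaN values in 'Second' and 'Third' are left as they were.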


For each column,

for col in df1.columns:
    df1[col].fillna(10, inplace=True)
df1

PS for future readers: avoid inplace. See the Stack Overflow question "In pandas, is inplace = True considered harmful, or not?"
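
Following that advice, here is a sketch of the same constant fill without inplace, assuming the df1 from the question. DataFrame.fillna also accepts a dict mapping column names to fill values; the per-column values below are made up purely for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'First': [100, 90, np.nan, 95],
                    'Second': [30, 45, 56, np.nan],
                    'Third': [np.nan, 40, 80, 98]})

# Fill every NaN with the same constant; fillna returns a new DataFrame
df_const = df1.fillna(10)

# Fill each column with its own value (hypothetical values, for illustration only)
fill_values = {'First': 0, 'Second': 10, 'Third': -1}
df_per_col = df1.fillna(fill_values)

print(df_const)
print(df_per_col)
```

Neither call modifies df1 itself, so the original data stays available for comparison.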

If you want to forward fill you can just do:

df1 = df1.ffill()

This results in:

    First   Second  Third
0   100.0   30.0    NaN
1   90.0    45.0    40.0
2   90.0    56.0    80.0
3   95.0    56.0    98.0

There is still one NaN value left (the first row of Third has nothing above it to fill from), so we can backfill as well:

df1 = df1.bfill()

Final result:

    First   Second  Third
0   100.0   30.0    40.0
1   90.0    45.0    40.0
2   90.0    56.0    80.0
3   95.0    56.0    98.0
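
The two steps can also be chained. The sketch below assumes the original df1 from the question and adds a check with isna() to confirm that no missing values remain; the names filled and remaining are just illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'First': [100, 90, np.nan, 95],
                    'Second': [30, 45, 56, np.nan],
                    'Third': [np.nan, 40, 80, 98]})

# Forward fill first, then backfill whatever is still missing at the top
filled = df1.ffill().bfill()

# Count the NaN values left over in the whole frame
remaining = int(filled.isna().sum().sum())

print(filled)
print(remaining)
```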

If you only want to forward fill missing values in specific columns, then use the following. Please note that I am NOT using inplace=True. That was the reason your code wasn't working before.

columns_to_fillna = ['Second', 'Third']
df1.loc[:, columns_to_fillna] = df1.loc[:, columns_to_fillna].ffill()

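If you really want to use inplace=True, which is not advised, be aware that it only affects the object it is called on. A selection made with a list of columns is a copy, so the line below fills that temporary copy and leaves df1 unchanged:

columns_to_fillna = ['Second', 'Third']
df1.loc[:, columns_to_fillna].ffill(inplace=True)  # fills a temporary copy, not df1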

The reason inplace is not advised is discussed here:
https://stackoverflow.com/a/60020384/6366770
