
How to handle missing values by including all types of filling in one function?

I wrote a naive function to handle missing values, but unfortunately it doesn't work.

Here is the example I have tried:

#Basic libraries
import os
import pandas as pd
import numpy as np

def handle_missing(df,list_columns,handle_type,value):
       
    if handle_type == "bfill":
        df = df.loc[:,list_columns].fillna(method='bfill', inplace=True)
                                
    elif handle_type == "ffill":
        df = df.loc[:,list_columns].fillna(method='bfill')
        
    elif handle_type == "mean":
        df = df[list_columns].fillna(df.mean()).round(2)
        
    elif handle_type == "dropna0":
        df = df[list_columns].dropna(axis=0, how='any')
    
    elif handle_type == "dropna1":
        df = df[list_columns].dropna(axis=1, how='any')
        
    else:
        df = df.loc[:,list_columns].fillna(value)

data_dict = {'First':[100, 90, np.nan, 95], 
        'Second': [30, 45, 56, np.nan], 
        'Third':[np.nan, 40, 80, 98]} 
  
df1 = pd.DataFrame(data_dict)
list_columns = ['First','Second','Third']
df1 = handle_missing(df1,list_columns,"bfill",0)
df1

Are there any issues with this function that I should take into consideration?

If this function isn't recommended, please suggest a better one.

Your function looks fine, except that it has no return statement, so it returns None and your df1 ends up as None. I would rewrite the function like this:

def handle_missing(df,list_columns,handle_type,value):
       
    if handle_type == "bfill":
        # no inplace
        df[list_columns] = df[list_columns].fillna(method='bfill')
                                
    elif handle_type == "ffill":
        # forward fill (the original used 'bfill' here by mistake)
        df[list_columns] = df.loc[:,list_columns].fillna(method='ffill')
        
    elif handle_type == "mean":
        df[list_columns] = df[list_columns].fillna(df.mean()).round(2)
        
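    elif handle_type == "dropna0":
        # drop rows that have a NaN in any of the listed columns
        df.dropna(axis=0, how='any', subset=list_columns, inplace=True)
    
    elif handle_type == "dropna1":
        # drop the listed columns that contain any NaN
        cols_with_nan = [c for c in list_columns if df[c].isna().any()]
        df.drop(columns=cols_with_nan, inplace=True)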
        
    else: 
        df[list_columns] = df.loc[:,list_columns].fillna(value)

Now the function modifies the DataFrame in place, so you only need to call it without reassigning the result:

handle_missing(df1,['Third'],"bfill",999)
print(df1)
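
If you would rather keep the `df1 = handle_missing(...)` calling style from the question, one alternative is to work on a copy and return it. The following is only a minimal sketch under a few assumptions: the names `fill_missing`, `strategy`, and the `"constant"` option are made up for illustration, and it uses `DataFrame.bfill()` / `DataFrame.ffill()` instead of `fillna(method=...)`, since recent pandas releases deprecate the `method` argument.

```python
import numpy as np
import pandas as pd


def fill_missing(df, columns, strategy, value=None):
    """Return a new DataFrame with missing values in `columns` handled.

    `fill_missing` and its argument names are hypothetical, chosen for
    this sketch; the caller's DataFrame is never modified.
    """
    out = df.copy()  # work on a copy so the input stays untouched

    if strategy == "bfill":
        out[columns] = out[columns].bfill()
    elif strategy == "ffill":
        out[columns] = out[columns].ffill()
    elif strategy == "mean":
        # fill each selected column with its own mean, then round
        out[columns] = out[columns].fillna(out[columns].mean()).round(2)
    elif strategy == "dropna0":
        # drop rows that have a NaN in any of the selected columns
        out = out.dropna(axis=0, how="any", subset=columns)
    elif strategy == "dropna1":
        # drop the selected columns that contain any NaN
        cols_with_nan = [c for c in columns if out[c].isna().any()]
        out = out.drop(columns=cols_with_nan)
    elif strategy == "constant":
        out[columns] = out[columns].fillna(value)
    else:
        # fail loudly instead of silently falling back to a constant fill
        raise ValueError(f"unknown strategy: {strategy!r}")

    return out


df1 = pd.DataFrame({
    "First": [100, 90, np.nan, 95],
    "Second": [30, 45, 56, np.nan],
    "Third": [np.nan, 40, 80, 98],
})

filled = fill_missing(df1, ["First", "Second", "Third"], "bfill")
print(filled)
```

Note that a backward fill cannot fill the last value of `Second`, because there is no later row to copy from, so that NaN remains. Raising on an unknown strategy is a design choice of this sketch; the original function falls back to `fillna(value)` instead.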
