Merging multiple DataFrames with the same columns and removing NaNs

Question

Suppose I have the following main df:

import numpy as np
import pandas as pd

df = pd.DataFrame({'name': ['Sara', 'John', 'Christine']})

df:

    name
0   Sara
1   John
2   Christine

Now I have 4 other dfs with age and grade for the same 3 names, but with NaNs in different positions:

df2:

df2 = pd.DataFrame({'name': ['Sara', 'John', 'Christine'],
                    'age': [26, 30, np.nan]})

df3:

df3 = pd.DataFrame({'name': ['Sara', 'John', 'Christine'],
                    'age': [np.nan, 30, 24]})

df4:

df4 = pd.DataFrame({'name': ['Sara', 'John', 'Christine'],
                    'grade': [np.nan, 1, 3]})

df5:

df5 = pd.DataFrame({'name': ['Sara', 'John', 'Christine'],
                    'grade': [12, np.nan, 3]})

I want to merge the data from the 4 dataframes into the main df on the name column and remove the NaNs.

What I have done so far:

I created a list of dfs:

dfs = [df, df2, df3, df4, df5]

Then I used reduce:

from functools import reduce

df_final = reduce(lambda left, right: pd.merge(left, right, on='name'), dfs)

df_final:

        name  age_x  age_y  grade_x  grade_y
0       Sara   26.0    NaN      NaN     12.0
1       John   30.0   30.0      1.0      NaN
2  Christine    NaN   24.0      3.0      3.0

Expected output:

df_final:

        name   age  grade
0       Sara  26.0   12.0
1       John  30.0    1.0
2  Christine  24.0    3.0

Answer 1

We can stack the frames into long format with concat, then use groupby first to retrieve the first valid (non-NaN) entry for each column per name:

merged = (
    pd.concat(dfs).groupby('name', sort=False, as_index=False).first()
)

merged:

        name   age  grade
0       Sara  26.0   12.0
1       John  30.0    1.0
2  Christine  24.0    3.0
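
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The helper names `names` and `stacked` are only introduced here for illustration and are not part of the question. It relies on the fact that concat fills columns a frame does not have with NaN, and that groupby first skips NaN values within each group:

```python
import numpy as np
import pandas as pd

# Rebuild the frames from the question (`names` is just a convenience variable).
names = ['Sara', 'John', 'Christine']
df = pd.DataFrame({'name': names})
df2 = pd.DataFrame({'name': names, 'age': [26, 30, np.nan]})
df3 = pd.DataFrame({'name': names, 'age': [np.nan, 30, 24]})
df4 = pd.DataFrame({'name': names, 'grade': [np.nan, 1, 3]})
df5 = pd.DataFrame({'name': names, 'grade': [12, np.nan, 3]})
dfs = [df, df2, df3, df4, df5]

# Stack all frames vertically: 5 frames x 3 rows = 15 rows.
# Columns that a given frame lacks are filled with NaN.
stacked = pd.concat(dfs, ignore_index=True)

# For each name, take the first non-NaN value in every column.
# sort=False keeps the names in their order of first appearance.
merged = stacked.groupby('name', sort=False, as_index=False).first()

print(merged)
```

Note that this keeps whichever non-NaN value comes first per name, so if two frames disagreed on a value (they do not in this data), the earlier frame in dfs would win.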

solution1
1 ACCEPTED 2021-07-22 03:15:24