Create a new row in a dataframe based only for non NaN values from a column

Question

Lets say i have i dataframe like this one:

      col1       col2         col3
0     data1   Completed       Fail
1     data2   Completed       NaN
2     data3   Completed    Completed
3     data4   Completed       NaN
4     data5      NaN          NaN

How can i add an extra row for each time the value in col3 is not NaN and have a dataframe like this:

      col1     status           
0     data1   Completed 
1     data1      Fail
2     data2   Completed     
3     data3   Completed    
4     data3   Completed
5     data4   Completed      
6     data5      NaN

I tried this but im not getting the desirable output:

df  = df.melt(id_vars=['col1'],  
        value_name="status")

Answer 1

IIUC, you can first start by using pd.melt() as you already did but also drop all the null values by chaining dropna() . This will get you close, but not exactly where you want to be:

new = df.melt(id_vars='col1',value_name='status').sort_values(by='col1').dropna().drop('variable',axis=1)

>>> print(new)

    col1     status
0  data1  Completed
5  data1       Fail
1  data2  Completed
2  data3  Completed
7  data3  Completed
3  data4  Completed

At this point, you will need to bring over the rows from your original df that were nan in col2. You can do that usingisnull() andpd.concat() respectively:

col2_nan = df.loc[df.col2.isnull()].drop('col3',axis=1).rename(columns = {'col2':'status'})

>>> print(pd.concat([new,col2_nan]).reset_index(drop=True))


    col1     status
0  data1  Completed
1  data1       Fail
2  data2  Completed
3  data3  Completed
4  data3  Completed
5  data4  Completed
6  data5        NaN

Create a new row in a dataframe based only for non NaN values from a column

Question

1 answers

solution1
1 ACCPTED 2021-12-15 14:48:34

Create a new row in a dataframe based only for non NaN values from a column

Question

1 answers

solution1 1 ACCPTED 2021-12-15 14:48:34

solution1
1 ACCPTED 2021-12-15 14:48:34