fill new column of pandas DataFrame based on if-else of other columns

Question

I have a situation where I want to create a new column in a Pandas DataFrame and populate it according to conditions involving 2 other columns. In this example:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.array([['value1','value2'],['value',np.NaN],[np.NaN,np.NaN]]), columns=['col1','col2'])

I would like to create a new column, 'new col', which consists of 1) the value in 'col2' if it is not NaN else, 2) the value in 'col1' if it is not NaN else, 3) NaN

I am trying this function with .apply(), but it is not returning the desired result:

def singleval(row):
    if row['col2'] != np.NaN:
        val = row['col2']
    elif row['col1'] != np.NaN:
        val = row['col1']
    else:
        val = np.NaN
    return val

df['new col'] = df.apply(singleval,axis=1)

I want the values in 'new col' to be ['value2', 'value', 'nan'].

Answer 1

Method 1 `fillna`

In this case, we can simply use fillna on col2 with values from col1:

df['new col'] = df['col2'].fillna(df['col1'])

     col1    col2 new col
0  value1  value2  value2
1   value     NaN   value
2     NaN     NaN     NaN
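As a minimal sketch of how the fillna approach extends to more than two columns, the calls can be chained so that each one fills only the gaps the previous one left. The extra column 'col0' and its 'fallback' values are hypothetical and added purely for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col0': ['fallback', 'fallback', 'fallback'],  # hypothetical extra column
                   'col1': ['value1', 'value', np.nan],
                   'col2': ['value2', np.nan, np.nan]})

# Two-column case from the answer: col2 where present, otherwise col1.
df['new col'] = df['col2'].fillna(df['col1'])

# Chained case: col2, then col1, then the hypothetical col0 as a last resort.
chained = df['col2'].fillna(df['col1']).fillna(df['col0'])
print(chained.tolist())
```

The order of the chain sets the priority: the leftmost Series wins wherever it has a value.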

Method 2 `np.select`

If you have multiple conditions, use np.select: pass it a list of conditions and a matching list of choices, and each row takes the choice belonging to the first condition that is True (or the default if none is):

conditions = [
    df['col2'].notnull(),
    df['col1'].notnull(),
]

choices=[df['col2'], df['col1']]

df['new col'] = np.select(conditions, choices, default=np.nan)

     col1    col2 new col
0  value1  value2  value2
1   value     NaN   value
2     NaN     NaN     NaN
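The following sketch illustrates that np.select picks the first matching condition, so the order of the lists encodes the priority. The names `result` and `swapped` are introduced here only for the demonstration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['value1', 'value', np.nan],
                   'col2': ['value2', np.nan, np.nan]})

conditions = [df['col2'].notna(), df['col1'].notna()]
choices = [df['col2'], df['col1']]

# np.select returns a plain NumPy array; col2 is listed first, so it has priority.
result = np.select(conditions, choices, default=np.nan)

# Reversing both lists gives col1 priority instead.
swapped = np.select(conditions[::-1], choices[::-1], default=np.nan)

print(result[0], swapped[0])
```

Row 0 has values in both columns, so it is the only row where the two orderings disagree.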

Note

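Your DataFrame does not actually contain NaN: np.array converts the mixed strings and floats to a single string dtype, so every missing value becomes the string 'nan'. Use this one instead to test (np.nan replaces np.NaN, an alias that NumPy 2.0 removed):

df = pd.DataFrame({'col1':['value1', 'value', np.nan],
                   'col2':['value2', np.nan, np.nan]})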
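A quick way to check the difference between the two constructions is to count the missing values pandas sees in each. This is only a sketch; the names `arr`, `bad` and `good` are made up for the comparison.

```python
import numpy as np
import pandas as pd

# Construction from the question: strings and floats mixed in one NumPy array.
arr = np.array([['value1', 'value2'], ['value', np.nan], [np.nan, np.nan]])
bad = pd.DataFrame(arr, columns=['col1', 'col2'])

# Construction from this note: one Python list per column.
good = pd.DataFrame({'col1': ['value1', 'value', np.nan],
                     'col2': ['value2', np.nan, np.nan]})

print(arr.dtype.kind)                  # 'U' means a unicode string array
bad_missing = int(bad.isna().sum().sum())
good_missing = int(good.isna().sum().sum())
print(bad_missing, good_missing)
```

The first frame holds the literal text 'nan', which neither fillna nor notnull treats as missing.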

Edit: why was the function not working?

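np.nan == np.nan returns False, because NaN never compares equal to anything, itself included. For the same reason, row['col2'] != np.nan is True for every row, so the first branch of your function always runs.
np.nan is np.nan returns True, because is compares object identity rather than value.

See this question for the explanation of this.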
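The comparisons can be tried directly in a short sketch. The variable names are illustrative, and `other_nan` stands for any NaN that is not the np.nan object itself.

```python
import numpy as np
import pandas as pd

same_value = (np.nan == np.nan)      # equality: NaN is never equal to NaN
same_object = (np.nan is np.nan)     # identity: the very same Python object

other_nan = float('nan')             # a NaN created independently of np.nan
other_is_np_nan = (other_nan is np.nan)
other_is_missing = pd.isna(other_nan)

print(same_value, same_object, other_is_np_nan, other_is_missing)
```

The last two results show why an identity test can miss a NaN that pd.isna still recognises.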

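So to fix your function, you can use is not (this relies on each missing value being the np.nan object itself, so pd.notna is the more robust test in general):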

def singleval(row):
    if row['col2'] is not np.nan:
        val = row['col2']
    elif row['col1'] is not np.nan:
        val = row['col1']
    else:
        val = np.nan
    return val

df['new col'] = df.apply(singleval, axis=1)

     col1    col2 new col
0  value1  value2  value2
1   value     NaN   value
2     NaN     NaN     NaN
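As an alternative sketch of the same row-wise function, pd.notna can replace the identity test. It is not part of the original answer. It also treats None and independently created NaN values as missing, which is why the test frame below mixes them in.

```python
import numpy as np
import pandas as pd

def singleval(row):
    # pd.notna is False for np.nan, float('nan') and None alike.
    if pd.notna(row['col2']):
        return row['col2']
    if pd.notna(row['col1']):
        return row['col1']
    return np.nan

df = pd.DataFrame({'col1': ['value1', 'value', float('nan')],
                   'col2': ['value2', None, np.nan]})

df['new col'] = df.apply(singleval, axis=1)
print(df)
```

For large frames the vectorised fillna or np.select methods are usually preferable to apply with axis=1, which calls the function once per row.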

Answer 2

Try this:

df['col3'] = df[['col1','col2']].stack().groupby(level=0).last()

output:

    col1    col2    col3
0   value1  value2  value2
1   value   nan     value
2   nan     nan     nan
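To see why this one-liner works, the steps can be split apart, as in the sketch below. The explicit dropna() is an assumption added here: older pandas versions drop NaN inside stack() by default, newer ones may keep it, and dropna() makes both behave the same way.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['value1', 'value', np.nan],
                   'col2': ['value2', np.nan, np.nan]})

# Long format: one entry per (row, column) pair that actually has a value.
stacked = df[['col1', 'col2']].stack().dropna()

# Per original row, keep the last surviving entry (col2 if present, else col1).
last = stacked.groupby(level=0).last()

# Row 2 has no entries at all, so index alignment leaves it as NaN.
df['col3'] = last
print(df)
```

The intermediate `stacked` and `last` names are illustrative only.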

Answer 3

Use df.ffill with axis=1:

df['new_col'] = df.ffill(axis=1)['col2']

Output:
     col1    col2 new_col
0  value1  value2  value2
1   value     NaN   value
2     NaN     NaN     NaN
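A slightly more defensive sketch of the same idea restricts the forward fill to the two source columns, so that unrelated columns in a wider frame cannot leak into col2. The `filled` name is introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['value1', 'value', np.nan],
                   'col2': ['value2', np.nan, np.nan]})

# Forward-fill left to right within each row: a missing col2 takes col1's value.
filled = df[['col1', 'col2']].ffill(axis=1)

df['new_col'] = filled['col2']
print(df)
```

The column order in the selection matters, because ffill with axis=1 only carries values from left to right.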

3 answers

solution1 (Answer 1): 2, accepted, 2019-05-13 23:20:11
solution2 (Answer 2): 0, 2019-05-13 23:18:44
solution3 (Answer 3): 0, 2019-05-14 01:26:13
