Alternative to nested np.where statements to retain NaN values while creating a new pandas boolean column based on two other existing columns

Question

I'm trying to figure out a more straightforward alternative for evaluating and creating a new column in a pandas dataframe based on two other columns that contain either True, False, or NaN values. I want the new column to evaluate as follows relative to the two reference columns:

If either is True -> True
If at least one is False and neither is True -> False
If both are NaN -> NaN
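
For reference, the three rules above can be written as a plain-Python function for a single pair of values. This is only a sketch for checking results against; the names `is_missing` and `combine` are made up for illustration and are not part of pandas or NumPy.

```python
import math


def is_missing(value):
    """Return True for None or a float NaN (hypothetical helper)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def combine(a, b):
    """Combine two values that are each True, False, or missing."""
    # Keep only the values that are actually present.
    present = [v for v in (a, b) if not is_missing(v)]
    if not present:
        # Both missing -> the result is missing too.
        return float('nan')
    # Otherwise: True if any present value is True, else False.
    return any(present)
```

Applying `combine` row by row would be slow on a large frame, so it is meant as a readable specification rather than a vectorized solution.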

I've figured out a solution using several nested np.where statements, but would prefer a more straightforward approach. For a single reference column, I figured out how to do this (shown below as col4), but I can't figure out whether there's a way to adapt it to take multiple reference columns into account.

Current Solution:

import pandas as pd
import numpy as np

d = {'col1': [True, True, True, False, False, False, np.nan, np.nan, np.nan],
     'col2': [True, False, np.nan, True, False, np.nan, True, False, np.nan]}
df = pd.DataFrame(data=d)

df['col3'] = np.where(
    pd.notnull(df['col1']) & pd.notnull(df['col2']),
    (df['col1'] == True) | (df['col2'] == True),
    np.where(
        pd.isnull(df['col1']) & pd.isnull(df['col2']),
        np.nan,
        np.where(pd.notnull(df['col1']), df['col1'], df['col2'])
    )
)

Single Reference Column Solution:

df['col4'] = df['col1'].map(lambda x: x, na_action='ignore')

Answer 1

np.select() is made for this type of job:

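# Conditions are checked in order; the object-dtype default keeps NaN as NaN.
# index=df.index keeps the result aligned if df has a non-default index.
df['col3'] = pd.Series(np.select(
    [(df.col1 == True) | (df.col2 == True), (df.col1 == False) | (df.col2 == False)],
    [True, False], default=np.array(np.nan, object)), index=df.index)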
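
Two details of the np.select() approach are easy to miss: the conditions are tried in order and the first match wins, and a pd.Series built without an index gets a fresh 0..n-1 index that only lines up with a default-indexed frame. A minimal sketch, assuming the question's data but with a made-up string index to make the alignment issue visible:

```python
import numpy as np
import pandas as pd

# Same data as in the question, but with a hypothetical non-default index.
df = pd.DataFrame(
    {'col1': [True, True, True, False, False, False, np.nan, np.nan, np.nan],
     'col2': [True, False, np.nan, True, False, np.nan, True, False, np.nan]},
    index=list('abcdefghi'),
)

# Conditions are checked in order and the first match wins,
# so "any True" has to come before "any False".
conditions = [
    (df['col1'] == True) | (df['col2'] == True),
    (df['col1'] == False) | (df['col2'] == False),
]

# The object-dtype default keeps NaN from being cast to a bool.
result = np.select(conditions, [True, False],
                   default=np.array(np.nan, dtype=object))

# Passing index=df.index keeps the rows aligned with df on assignment.
df['col3'] = pd.Series(result, index=df.index)
```

Assigning the NumPy array directly (df['col3'] = result) should also work, since arrays are assigned by position rather than by index.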

Or, using only pandas, though I find this version less readable:

df['col3'] = df.col1.where(df.col1, df.col2.where(df.col2.notnull(), df.col1))
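
Another option, not taken from the answer above, is to treat the rule as a row-wise maximum that skips NaN: True beats False, and NaN only survives when both values are missing. A sketch under the assumption that the two columns hold nothing but True, False, and NaN:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'col1': [True, True, True, False, False, False, np.nan, np.nan, np.nan],
     'col2': [True, False, np.nan, True, False, np.nan, True, False, np.nan]}
)

# Cast to float so that True -> 1.0, False -> 0.0 and NaN stays NaN.
as_float = df[['col1', 'col2']].astype(float)

# The row-wise max skips NaN by default; an all-NaN row yields NaN.
row_max = as_float.max(axis=1)

# Map the numbers back to booleans; NaN is not in the mapping, so it stays NaN.
df['col3'] = row_max.map({1.0: True, 0.0: False})
```

This form extends to more than two reference columns by listing them in the column selection, at the cost of a round trip through float.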
