How to update column with multiple conditions taking a corresponding value from another column Python Pandas

Question

The output I want is what the following SQL statement would produce.

   UPDATE table_A SET final=(cs+fhfa+sz)/3 WHERE cs IS NOT NULL AND fhfa IS NOT NULL AND sz IS NOT NULL;

Here cs, fhfa, and sz are individual columns in the SQL table (and in the DataFrame).

If I convert this SQL statement to a pandas operation in Python, it looks something like this:

   df['div_3'] = (df.cs+df.fhfa+df.sz) /3
   df['final'] = df.loc[(df['cs'] != None) & (df['fhfa'] != None) & (df['sz'] != None) ] = df['div_3']

But this does not guarantee that the corresponding values end up in the right rows. How can I achieve this?

Do I really need to create another column, div_3, holding the sum of the three columns divided by 3? Can this be done without creating another column?

Answer 1

Filter on pd.DataFrame.notnull and call mean.

c = ['cs', 'fhfa', 'sz']
df['final'] = df[df[c].notnull().all(1)][c].mean(1)
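
To see that each mean lands on its own row, here is a small sketch on made-up data (the values below are assumptions for illustration, not from the question). The filtered frame keeps its original index labels, and pandas aligns on those labels during assignment, so rows with a missing value simply get NaN in final.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: rows 1 and 2 each have one missing value.
df = pd.DataFrame({
    'cs':   [3.0, np.nan, 6.0],
    'fhfa': [6.0, 2.0, 9.0],
    'sz':   [9.0, 4.0, np.nan],
})
c = ['cs', 'fhfa', 'sz']

# True only for rows where all three columns are non-null.
complete = df[c].notnull().all(axis=1)

# The right-hand side holds only the complete rows; assignment aligns
# on the index, so the remaining rows of 'final' become NaN.
df['final'] = df.loc[complete, c].mean(axis=1)
print(df)
```

Note that this overwrites the whole final column. If final already exists and the other rows should keep their old values, as the SQL UPDATE would, assigning through the mask with df.loc[complete, 'final'] = ... is probably the closer match.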

Answer 2

IIUC:

df.loc[:, 'final'] = df.loc[df[['cs','fhfa','sz']].notnull().all(1), ['cs','fhfa','sz']].sum(1)/3

.all(1) is the same as .all(axis=1), which means a row passes only if every value in that row is True.
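
As a rough illustration of the row mask, and of one possible way to skip both the mask and the helper column, the sketch below uses mean with skipna=False, which returns NaN for any row containing a missing value. The sample data and the names cols and mask are assumptions made up for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: row 1 is missing 'cs'.
df = pd.DataFrame({
    'cs':   [1.0, np.nan, 4.0],
    'fhfa': [2.0, 5.0, 4.0],
    'sz':   [3.0, 5.0, 4.0],
})
cols = ['cs', 'fhfa', 'sz']

# One boolean per row: True only when all three columns are non-null.
mask = df[cols].notnull().all(axis=1)
print(mask)

# skipna=False makes the row mean NaN whenever any value in the row
# is NaN, so the result matches the masked version without a mask.
df['final'] = df[cols].mean(axis=1, skipna=False)
print(df)
```

This variant also rebuilds final for every row, so it fits the case where final does not need to keep earlier values.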