Check conditions to concat value in dataframe (Python)

Question

Let's say I have a dataframe.

ID	B	C	D	E
1.	b1_main	null	d_value	e_value
2.	b2_main	null	null	e_value

I want to concatenate the value in column B with the value from exactly one of columns C, D or E. C always has first priority: only if the value in C is null should the value in D be used, and only if D is also null should the value in E be used.

Desired Output

ID	B	C	D	E
1.	b1_main, d_value	null	d_value	e_value
2.	b2_main, e_value	null	null	e_value

The code I tried is below; however, it concatenates all the non-null values in C, D and E instead of only the first one.

df['B'] = df[['B', 'C', 'D', 'E']].apply(lambda x: ','.join(x.dropna()), axis=1)

Thank you.

Answer 1

Back-fill missing values across the columns 'C', 'D', 'E' so that the first column (here C) holds the first non-missing value in each row, then select that column and append it to column B:

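import numpy as np

# Treat the literal string 'null' as a missing value
df = df.replace('null', np.nan)
# Back-fill across C, D, E and take the first column, i.e. the first non-missing value per row
df['B'] = df['B'] + ', ' + df[['C', 'D', 'E']].bfill(axis=1).iloc[:, 0]
# Equivalent: select by column name
#df['B'] = df['B'] + ', ' + df[['C', 'D', 'E']].bfill(axis=1)['C']
print(df)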
    ID                 B   C        D        E
0  1.0  b1_main, d_value NaN  d_value  e_value
1  2.0  b2_main, e_value NaN      NaN  e_value
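
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the missing values arrive as the literal string 'null', as in the question; the sample frame and the name first_value are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question; 'null' is assumed to be a literal string.
df = pd.DataFrame({
    "ID": [1, 2],
    "B": ["b1_main", "b2_main"],
    "C": ["null", "null"],
    "D": ["d_value", "null"],
    "E": ["e_value", "e_value"],
})

# Turn the 'null' strings into real missing values.
df = df.replace("null", np.nan)

# Back-fill along each row: afterwards the first column (C) holds the
# first non-missing value among C, D, E for that row.
first_value = df[["C", "D", "E"]].bfill(axis=1).iloc[:, 0]

# Append that single value to B.
df["B"] = df["B"] + ", " + first_value
print(df["B"].tolist())
```

Only B is reassigned here; C, D and E keep their original (missing) values because the back-fill is applied to a temporary copy.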

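If C, D and E can all be missing in the same row, plain + would turn B into NaN for that row. In that case use Series.str.cat, which keeps the B value and lets you strip the leftover separator: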

df = df.replace('null', np.nan)
s = df[['C', 'D', 'E']].bfill(axis=1)['C']
df['B'] = df['B'].str.cat(s, na_rep='', sep=', ').str.strip(' ,')
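
A small sketch of the difference between the two variants, assuming a hypothetical third row (b3_main) where C, D and E are all missing. The names plain and safe are only illustrative.

```python
import numpy as np
import pandas as pd

# Third row is a made-up case in which C, D and E are all missing.
df = pd.DataFrame({
    "B": ["b1_main", "b2_main", "b3_main"],
    "C": [np.nan, np.nan, np.nan],
    "D": ["d_value", np.nan, np.nan],
    "E": ["e_value", "e_value", np.nan],
})

# First non-missing value among C, D, E per row (NaN if none exists).
s = df[["C", "D", "E"]].bfill(axis=1)["C"]

# Plain string addition propagates NaN, so B is lost in the third row.
plain = df["B"] + ", " + s

# str.cat substitutes '' for the missing value; strip then removes the
# dangling ', ' that is left behind.
safe = df["B"].str.cat(s, na_rep="", sep=", ").str.strip(" ,")

print(plain.tolist())
print(safe.tolist())
```

Note that str.strip(' ,') also trims any leading or trailing spaces and commas that were already part of the B values, which may or may not matter for your data.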

solution1
1 ACCEPTED 2021-05-06 05:26:56