Let's say I have a dataframe:
ID | B | C | D | E |
---|---|---|---|---|
1. | b1_main | null | d_value | e_value |
2. | b2_main | null | null | e_value |
I want to concatenate the value in column B with the value from exactly one of columns C, D or E. Column C always takes priority: if C is not null, its value is appended to B. If C is null, the value in D is used instead, and E is used only if D is also null.
Desired output:
ID | B | C | D | E |
---|---|---|---|---|
1. | b1_main, d_value | null | d_value | e_value |
2. | b2_main, e_value | null | null | e_value |
The code I tried is below; however, it concatenates all the values in C, D and E and only drops the nulls.
df['B'] = pb6_branded[['B','C', 'D', 'E']].apply(lambda x: ','.join(x.dropna()), axis=1)
Thank you.
Back fill the missing values row-wise across columns 'C', 'D', 'E', then select the first column (here C) and append it to column B:
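import numpy as np

# treat the literal string 'null' as a missing value
df = df.replace('null', np.nan)
# bfill along each row moves the first non-missing value of C, D, E into column C
df['B'] = df['B'] + ', ' + df[['C', 'D', 'E']].bfill(axis=1).iloc[:, 0]
# alternative: select by column name
#df['B'] = df['B'] + ', ' + df[['C', 'D', 'E']].bfill(axis=1)['C']
print (df)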
ID B C D E
0 1.0 b1_main, d_value NaN d_value e_value
1 2.0 b2_main, e_value NaN NaN e_value
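For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the table in the question (the string 'null' standing in for missing values is an assumption about how the data was loaded), and `first_value` is just an illustrative variable name.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; 'null' is assumed to be a literal string.
df = pd.DataFrame({
    "ID": [1, 2],
    "B": ["b1_main", "b2_main"],
    "C": ["null", "null"],
    "D": ["d_value", "null"],
    "E": ["e_value", "e_value"],
})

# Turn the placeholder strings into real missing values.
df = df.replace("null", np.nan)

# Back fill along each row, so column C ends up holding the first
# non-missing value among C, D, E (in that priority order).
first_value = df[["C", "D", "E"]].bfill(axis=1).iloc[:, 0]

# Append that single value to B; the original C, D, E columns are left untouched.
df["B"] = df["B"] + ", " + first_value
print(df["B"].tolist())
```

Because the back fill runs on a temporary copy of the three columns, only B changes in `df`.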
If C, D and E can all be missing in the same row, the plain + concatenation returns NaN for that row. In that case it is possible to use Series.str.cat:
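df = df.replace('null', np.nan)
# first non-missing value of C, D, E per row (NaN if all three are missing)
s = df[['C', 'D', 'E']].bfill(axis=1)['C']
# na_rep='' keeps B when s is NaN; strip removes the leftover ', ' separator
df['B'] = df['B'].str.cat(s, na_rep='', sep=', ').str.strip(' ,')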
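To see the difference between the two variants, here is a small sketch with a third row in which C, D and E are all missing. That extra row and the names `plain` and `safe` are made up for illustration and are not part of the question's data.

```python
import numpy as np
import pandas as pd

# Third row is hypothetical: every candidate column is missing.
df = pd.DataFrame({
    "B": ["b1_main", "b2_main", "b3_main"],
    "C": [np.nan, np.nan, np.nan],
    "D": ["d_value", np.nan, np.nan],
    "E": ["e_value", "e_value", np.nan],
})

# First non-missing value per row; stays NaN when all three are missing.
s = df[["C", "D", "E"]].bfill(axis=1).iloc[:, 0]

# Plain concatenation propagates NaN, so B is lost in the third row.
plain = df["B"] + ", " + s

# str.cat with na_rep='' keeps B, and strip removes the dangling separator.
safe = df["B"].str.cat(s, sep=", ", na_rep="").str.strip(" ,")

print(plain.tolist())
print(safe.tolist())
```

Note that `str.strip(' ,')` also removes leading or trailing spaces and commas that belong to the values themselves, so this assumes the values in B and in the appended column do not start or end with those characters.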