Lets say i have i dataframe like this one:
col1 col2 col3
0 data1 Completed Fail
1 data2 Completed NaN
2 data3 Completed Completed
3 data4 Completed NaN
4 data5 NaN NaN
How can i add an extra row for each time the value in col3 is not NaN and have a dataframe like this:
col1 status
0 data1 Completed
1 data1 Fail
2 data2 Completed
3 data3 Completed
4 data3 Completed
5 data4 Completed
6 data5 NaN
I tried this but im not getting the desirable output:
df = df.melt(id_vars=['col1'],
value_name="status")
IIUC, you can first start by using pd.melt()
as you already did but also drop all the null values by chaining dropna()
. This will get you close, but not exactly where you want to be:
new = df.melt(id_vars='col1',value_name='status').sort_values(by='col1').dropna().drop('variable',axis=1)
>>> print(new)
col1 status
0 data1 Completed
5 data1 Fail
1 data2 Completed
2 data3 Completed
7 data3 Completed
3 data4 Completed
At this point, you will need to bring over the rows from your original df
that were nan
in col2. You can do that usingisnull()
andpd.concat()
respectively:
col2_nan = df.loc[df.col2.isnull()].drop('col3',axis=1).rename(columns = {'col2':'status'})
>>> print(pd.concat([new,col2_nan]).reset_index(drop=True))
col1 status
0 data1 Completed
1 data1 Fail
2 data2 Completed
3 data3 Completed
4 data3 Completed
5 data4 Completed
6 data5 NaN
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.