Let's say that I have this dataframe :
data3 = ['ID','ID','','','','','']
data4 = [12,34,465,678,896,'','']
data5 = [8798,67,2313,'','','','']
data6 = [56,67,'','','','','']
df2 = pd.DataFrame(list(zip(data3,data4,data5,data6)),columns = ['Name','Data1','Data2','Data3'])
print(df2)
Name Data1 Data2 Data3
0 ID 12 8798 56
1 ID 34 67 67
2 465 2313
3 678
4 896
5
6
I want to fill the column "Name" with always the same value that we can find it and for all the rows where there are values and drop the rows useless where there is nothing. So I would like to get this result :
Name Data1 Data2 Data3
0 ID 12 8798 56
1 ID 34 67 67
2 ID 465 2313
3 ID 678
4 ID 896
Anyone has an idea efficient to do that ?
Thank you
Use DataFrame.replace
if empty strings, not NaN
s, then DataFrame.dropna
and last forward filling missing values in Name
column by ffill
:
df2 = df2.replace('', np.nan)
df2 = df2.dropna(how='all')
df2['Name'] = df2['Name'].ffill()
print(df2)
Name Data1 Data2 Data3
0 ID 12.0 8798.0 56.0
1 ID 34.0 67.0 67.0
2 ID 465.0 2313.0 NaN
3 ID 678.0 NaN NaN
4 ID 896.0 NaN NaN
You can use df.replace
, isna()
with all
to drop Nan
in all rows and fill Nan
with ffill()
:
In [2731]: df2 = df2.replace('', np.nan)
In [2756]: df2 = df2[~df2.isna().all(1)]
In [2733]: df2.Name = df2.Name.ffill()
In [2758]: df2
Out[2758]:
Name Data1 Data2 Data3
0 ID 12.0 8798.0 56.0
1 ID 34.0 67.0 67.0
2 ID 465.0 2313.0 NaN
3 ID 678.0 NaN NaN
4 ID 896.0 NaN NaN
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.