Name | place | pers_data |
---|---|---|
NaN | NaN | NaN |
Smith John | NY | sjohn@gmail.com |
NaN | NaN | 0987 4567 |
NaN | NaN | 0653 6734 |
Vic Stied | SA | 0986 5332 |
NaN | NaN | vickie@hotmail.com |
I would like to delete the NaN values and reformat the file like the following:
Name | place | pers_data | other | other_2 |
---|---|---|---|---|
Smith John | NY | sjohn@gmail.com | 0987 4567 | 0653 6734 |
Vic Stied | SA | vickie@hotmail.com | 0986 5332 | |
Can someone help me with that? I tried a few things, but without understanding them, and I'd really like to understand what I am doing.
This is a variation on a pivot:
idx = df['Name'].notna().cumsum()
out = (df
   .assign(col=df.groupby(idx).cumcount(),
           Name=df['Name'].groupby(idx).ffill(),
           place=df['place'].groupby(idx).ffill()
           )
   .pivot(index=['Name', 'place'], columns='col', values='pers_data')
   .add_prefix('other_').rename(columns={'other_0': 'pers_data'})
   .reset_index().rename_axis(columns=None)
   .dropna(how='all')
)
output:
Name place pers_data other_1 other_2
1 Smith John NY sjohn@gmail.com 0987 4567 0653 6734
2 Vic Stied SA 0986 5332 vickie@hotmail.com NaN
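Since the question asks how this works, here is a minimal sketch of the two helper columns the pivot relies on. The sample frame is rebuilt from the table in the question, and the names `col` and `steps` are only for illustration:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question's table (an assumption about
# how the real file loads: missing cells become NaN).
df = pd.DataFrame({
    "Name": [np.nan, "Smith John", np.nan, np.nan, "Vic Stied", np.nan],
    "place": [np.nan, "NY", np.nan, np.nan, "SA", np.nan],
    "pers_data": [np.nan, "sjohn@gmail.com", "0987 4567", "0653 6734",
                  "0986 5332", "vickie@hotmail.com"],
})

# Every non-null Name starts a new record; cumsum turns the True/False
# flags into a running group id: 0, 1, 1, 1, 2, 2.
idx = df["Name"].notna().cumsum()

# Position of each row inside its group: 0, 0, 1, 2, 0, 1.
# These positions become the column labels after the pivot.
col = df.groupby(idx).cumcount()

# Show the helpers next to the data to see what the pivot will use.
steps = df.assign(group=idx, col=col)
print(steps)
```

The forward fill per group then copies `Name` and `place` down onto the continuation rows, so the pivot can use them as the row index and spread `pers_data` across the `col` positions.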
df1.loc[~df1.isna().all(axis=1)].ffill()\
.groupby(['Name','place']).agg(','.join)\
.pers_data.str.split(',',expand=True).add_prefix('other_')\
.rename(columns={'other_0':'pers_data'}).reset_index()
Name place pers_data other_1 other_2
0 Smith John NY sjohn@gmail.com 0987 4567 0653 6734
1 Vic Stied SA 0986 5332 vickie@hotmail.com None
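A possible variation on the same groupby idea, sketched here with the sample data from the question: it collects the values into lists instead of joining and splitting on a comma, which should also cope with values that contain commas themselves. The names `cleaned`, `lists` and `wide` are illustrative, not part of the answer above:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question's table.
df1 = pd.DataFrame({
    "Name": [np.nan, "Smith John", np.nan, np.nan, "Vic Stied", np.nan],
    "place": [np.nan, "NY", np.nan, np.nan, "SA", np.nan],
    "pers_data": [np.nan, "sjohn@gmail.com", "0987 4567", "0653 6734",
                  "0986 5332", "vickie@hotmail.com"],
})

# Drop rows where every column is NaN, then copy Name/place downwards
# onto the continuation rows.
cleaned = df1.dropna(how="all")
cleaned = cleaned.assign(Name=cleaned["Name"].ffill(),
                         place=cleaned["place"].ffill())

# One list of pers_data values per person.
lists = cleaned.groupby(["Name", "place"])["pers_data"].agg(list)

# Expand the lists into columns; shorter lists are padded with missing values.
wide = pd.DataFrame(lists.tolist(), index=lists.index)
wide = (wide.add_prefix("other_")
            .rename(columns={"other_0": "pers_data"})
            .reset_index())
print(wide)
```

As in the outputs above, the values stay in their original row order, so for Vic Stied the phone number lands in `pers_data` and the e-mail in `other_1`.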