I want to transpose a pandas DataFrame into a long (tabular) format using pandas functionality.
All the phone numbers should be listed under a column msisd,
and play_id should hold the name of the column each number came from (phone1, phone2, and so on).
The DataFrame is:
import pandas as pd

df = pd.DataFrame({
    'id': ['1', '2', '3'],
    'play_id': ['20002075', '601731', '601731'],
    'phone1': ['0900031349', '', ''],
    'phone2': ['090891349', '', ''],
    'phone3': ['', '', ''],
    'phone4': ['', '', ''],
    'phone5': ['', '088235311', ''],
    'phone6': ['', '', ''],
    'phone7': ['', '', '088235311']
})
Expected output should be:
    id  play_id  msisd
1:  1   phone1   0900031349
2:  2   phone2   090891349
Use DataFrame.melt, then remove the rows that hold empty strings by boolean indexing:
df1 = df.melt(['id', 'play_id'], value_name='val', var_name='phone')
df1 = df1[df1['val'] != '']
# if the empty values are NaNs instead of empty strings
# df1 = df1[df1['val'].notna()]
print(df1)
   id   play_id   phone         val
0   1  20002075  phone1  0900031349
3   1  20002075  phone2   090891349
13  2    601731  phone5   088235311
20  3    601731  phone7   088235311
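If the goal is literally the layout from the question (id, play_id, msisd, with play_id holding the source column name), one possible variation is sketched below. It assumes the original play_id values can be discarded, which the question implies but does not state outright. The frame is trimmed to the phone columns that contain data, and the name out is only illustrative.

```python
import pandas as pd

# Trimmed copy of the question's data (all-empty phone columns left out).
df = pd.DataFrame({
    'id': ['1', '2', '3'],
    'play_id': ['20002075', '601731', '601731'],
    'phone1': ['0900031349', '', ''],
    'phone2': ['090891349', '', ''],
    'phone5': ['', '088235311', ''],
    'phone7': ['', '', '088235311'],
})

# Assumption: the original play_id is not needed, so it is dropped and
# its name is reused for the melted column names.
out = (
    df.drop(columns='play_id')
      .melt(id_vars='id', var_name='play_id', value_name='msisd')
)

# Keep only rows that actually carry a phone number, then tidy the order.
out = out[out['msisd'] != '']
out = out.sort_values(['id', 'play_id']).reset_index(drop=True)
print(out)
```

Here melt does the reshaping and the boolean mask does the filtering, exactly as in the answer; only the column labels differ.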
Or use DataFrame.stack after replacing the empty strings with missing values:
import numpy as np

df1 = (df.set_index(['id', 'play_id'])
         .replace('', np.nan)
         .stack()
         .reset_index(name='val')
         .rename(columns={'level_2': 'phone'})
       )
print(df1)
  id   play_id   phone         val
0  1  20002075  phone1  0900031349
1  1  20002075  phone2   090891349
2  2    601731  phone5   088235311
3  3    601731  phone7   088235311
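The stack approach relies on stack() discarding the missing values. As far as I know, the newer stack implementation (opt-in via future_stack=True since pandas 2.1) keeps them, so a slightly more defensive sketch adds an explicit dropna() and names the index levels up front instead of renaming level_2 afterwards. The trimmed frame and the name stacked are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Trimmed copy of the question's data (all-empty phone columns left out).
df = pd.DataFrame({
    'id': ['1', '2', '3'],
    'play_id': ['20002075', '601731', '601731'],
    'phone1': ['0900031349', '', ''],
    'phone2': ['090891349', '', ''],
    'phone5': ['', '088235311', ''],
    'phone7': ['', '', '088235311'],
})

stacked = (
    df.set_index(['id', 'play_id'])
      .replace('', np.nan)    # empty strings become missing values
      .stack()                # phone columns move into the row index
      .dropna()               # does nothing if stack() already dropped NaN
      .rename_axis(['id', 'play_id', 'phone'])  # name the new index level
      .reset_index(name='val')
)
print(stacked)
```

On this sample data the result should contain the same four rows as the melt version, in row-by-row order of the original frame.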