I have a column of comma-separated numbers, and the values should be split into new columns.
Site  UserId
ABC   '456,567,67,96'
DEF   '67,987'
POC   '4321,96,912'
The new DataFrame should look like this:
Site  UserId            UserId1  UserId2  UserId3  UserId4
ABC   '456,567,67,96'   456      567      67       96
DEF   '67,987'          67       987
POC   '4321,96,912'     4321     96       912
I also need a column next to each new column that maps the number to its name, using this user table:
UserId  UserName      Phone No
4321    EB_Meter      9980688666
987     EB_Meter987   9255488721
912     DG_Meter912   8897634219
567     Ups_Meter567  7263193155
456     Ups_Meter456  8987222112
96      DG_Meter96
67      DGB_Meter
So the final DataFrame is:
Values Value1 Name1 Phone1 Value2 Name2 Value3 Name3 Value4 Name4
'456,567,67,96' 456 Ups_Meter456 8987222112 567 Ups_Meter567 67 DGB_Meter 96 DG_Meter96
'67,987' 67 DGB_Meter 987 EB_Meter987
'4321,96,912' 4321 EB_Meter 9980688666 96 DG_Meter96 912 DG_Meter912
Use Series.str.strip to remove the surrounding quotes, then Series.str.split with expand=True to build a new DataFrame:
import pandas as pd
# strip the surrounding quotes, then split on commas into separate columns
df = df1['UserID'].str.strip("'").str.split(',', expand=True)
print(df)
0 1 2 3
0 456 567 67 96
1 67 987 None None
2 4321 96 912 None
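For anyone who wants to run this step on its own, here is a minimal sketch. The sample frame is rebuilt from the question, and I am assuming the quote characters are really part of the stored strings; the name `values` is just my own label for the result.

```python
import pandas as pd

# Sample data rebuilt from the question; the quote characters are assumed
# to be part of each stored value.
df1 = pd.DataFrame({
    "Site": ["ABC", "DEF", "POC"],
    "UserID": ["'456,567,67,96'", "'67,987'", "'4321,96,912'"],
})

# Remove the surrounding quotes, then split on commas.
# expand=True returns a DataFrame with one column per position.
values = df1["UserID"].str.strip("'").str.split(",", expand=True)

print(values.shape)             # (3, 4)
print(values.iloc[0].tolist())  # ['456', '567', '67', '96']
```

Rows with fewer ids than the longest row are padded with missing values, which is why the later steps have to cope with None/NaN.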
Then convert df2['UserId'] to strings so that it matches the split values, reshape the data with DataFrame.stack, map the names with Series.map, and reshape back to a DataFrame with Series.unstack:
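# the split values are strings, so the lookup keys must be strings too
df2['UserId'] = df2['UserId'].astype(str)
s = df2.set_index('UserId')['UserName']
# stack to one long Series, map ids to names, then restore the wide shape
df3 = df.stack(dropna=False).map(s).unstack()
print(df3)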
0 1 2 3
0 Ups_Meter456 Ups_Meter567 DGB_Meter DG_Meter96
1 DGB_Meter EB_Meter987 NaN NaN
2 EB_Meter DG_Meter96 DG_Meter912 NaN
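A possible variation, not the code from the answer: the sketch below applies Series.map column by column instead of going through stack/unstack. As far as I know, newer pandas releases are phasing out the dropna argument of DataFrame.stack, so this form may be easier to keep working across versions. The names `values`, `names_by_id` and `names` are my own, and the data is rebuilt from the question.

```python
import pandas as pd

# Split ids as produced by the previous step (rebuilt here so the
# example runs on its own).
values = pd.DataFrame([
    ["456", "567", "67", "96"],
    ["67", "987", None, None],
    ["4321", "96", "912", None],
])

df2 = pd.DataFrame({
    "UserId": [4321, 987, 912, 567, 456, 96, 67],
    "UserName": ["EB_Meter", "EB_Meter987", "DG_Meter912", "Ups_Meter567",
                 "Ups_Meter456", "DG_Meter96", "DGB_Meter"],
})

# The lookup keys must be strings to match the split values.
names_by_id = (df2.assign(UserId=df2["UserId"].astype(str))
                  .set_index("UserId")["UserName"])

# Map each column separately; cells without a match (including the
# missing padding cells) come back as NaN.
names = values.apply(lambda col: col.map(names_by_id))
print(names)
```

The result has the same shape and column labels as `values`, so it can be combined with it in the next step in the same way as df3.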
Join both DataFrames together with concat, reorder the columns of the resulting MultiIndex with DataFrame.sort_index, flatten the MultiIndex in a list comprehension with f-strings, and finally add the original UserID column with DataFrame.join:
df = (pd.concat([df, df3], axis=1, keys=('Value','Name'))
        .sort_index(axis=1, level=[1,0], ascending=[True, False]))
df.columns = [f'{x}{y+1}' for x, y in df.columns]
df = df1.join(df)
print (df)
UserID Value1 Name1 Value2 Name2 Value3 \
0 456,567,67,96 456 Ups_Meter456 567 Ups_Meter567 67
1 67,987 67 DGB_Meter 987 EB_Meter987 None
2 4321,96,912 4321 EB_Meter 96 DG_Meter96 912
Name3 Value4 Name4
0 DGB_Meter 96 DG_Meter96
1 NaN None NaN
2 DG_Meter912 None NaN
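To make the column reordering easier to follow, here is a self-contained sketch of the whole chain with comments on each step. The sample data is rebuilt from the question (ids stored as strings for brevity, which is my assumption), the names `values`, `names`, `wide` and `result` are mine, and the names are mapped column by column rather than with stack/unstack.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Site": ["ABC", "DEF", "POC"],
    "UserID": ["'456,567,67,96'", "'67,987'", "'4321,96,912'"],
})
df2 = pd.DataFrame({
    "UserId": ["4321", "987", "912", "567", "456", "96", "67"],
    "UserName": ["EB_Meter", "EB_Meter987", "DG_Meter912", "Ups_Meter567",
                 "Ups_Meter456", "DG_Meter96", "DGB_Meter"],
})

values = df1["UserID"].str.strip("'").str.split(",", expand=True)
names_by_id = df2.set_index("UserId")["UserName"]
names = values.apply(lambda col: col.map(names_by_id))

# keys= adds an outer column level: ('Value', 0), ..., ('Name', 0), ...
wide = pd.concat([values, names], axis=1, keys=["Value", "Name"])

# Sort by position first (level 1, ascending), then put 'Value' before
# 'Name' within each position (level 0, descending).
wide = wide.sort_index(axis=1, level=[1, 0], ascending=[True, False])

# Flatten ('Value', 0) -> 'Value1', ('Name', 0) -> 'Name1', and so on.
wide.columns = [f"{label}{pos + 1}" for label, pos in wide.columns]

result = df1.join(wide)
print(result.columns.tolist())
```

The descending sort on the outer level only works here because 'Value' happens to sort after 'Name'; with other labels the order would have to be set explicitly.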
If necessary, replace the None/NaN values with empty strings using DataFrame.fillna:
df = df.fillna('')
print (df)
UserID Value1 Name1 Value2 Name2 Value3 \
0 456,567,67,96 456 Ups_Meter456 567 Ups_Meter567 67
1 67,987 67 DGB_Meter 987 EB_Meter987
2 4321,96,912 4321 EB_Meter 96 DG_Meter96 912
Name3 Value4 Name4
0 DGB_Meter 96 DG_Meter96
1
2 DG_Meter912
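The expected output in the question also has phone columns, which the code above does not produce. One way to get them, sketched under my own assumptions: build the Value/Name/Phone columns in an explicit loop instead of sorting a MultiIndex. The phone numbers are stored as strings here so that missing entries do not turn the column into floats, and `lookup`, `pieces` and `wide` are hypothetical names.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Site": ["ABC", "DEF", "POC"],
    "UserID": ["'456,567,67,96'", "'67,987'", "'4321,96,912'"],
})
df2 = pd.DataFrame({
    "UserId": [4321, 987, 912, 567, 456, 96, 67],
    "UserName": ["EB_Meter", "EB_Meter987", "DG_Meter912", "Ups_Meter567",
                 "Ups_Meter456", "DG_Meter96", "DGB_Meter"],
    # Assumption: phones kept as strings; the last two users have none.
    "Phone No": ["9980688666", "9255488721", "8897634219", "7263193155",
                 "8987222112", None, None],
})

values = df1["UserID"].str.strip("'").str.split(",", expand=True)

# Index the user table by the id as a string so it matches the split values.
lookup = df2.assign(UserId=df2["UserId"].astype(str)).set_index("UserId")

pieces = []
for i in values.columns:
    n = i + 1  # 1-based suffix for the new column names
    pieces.append(values[i].rename(f"Value{n}"))
    pieces.append(values[i].map(lookup["UserName"]).rename(f"Name{n}"))
    pieces.append(values[i].map(lookup["Phone No"]).rename(f"Phone{n}"))

# Join back to the original frame and blank out the missing cells.
wide = df1.join(pd.concat(pieces, axis=1)).fillna("")
print(wide)
```

Because the columns are appended in the order they should appear, no sorting or MultiIndex flattening is needed in this version.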