
Merge and create multiple columns based on the number of columns present in the DataFrame - Pandas

I have a column with numbers separated by commas, and the values should be split into new columns.

Site     UserId
ABC      '456,567,67,96'
DEF      '67,987'
 

The new Dataframe should look like:

Site     UserId               UserId1  UserId2  UserId3  UserId4
ABC      '456,567,67,96'      456      567      67       96
DEF      '67,987'             67       987
POC      '4321,96,912'        4321     96       912

I also need extra columns next to each new column that map each number to its name and phone number, taken from this user table:

UserId   UserName       PhoneNo
4321     EB_Meter       9980688666
987      EB_Meter987    9255488721
912      DG_Meter912    8897634219
567      Ups_Meter567   7263193155
456      Ups_Meter456   8987222112
96       DG_Meter96
67       DGB_Meter

So the final DataFrame is:

Values            Value1  Name1         Phone1      Value2  Name2         Phone2      Value3  Name3         Phone3      Value4  Name4         Phone4
'456,567,67,96'   456     Ups_Meter456  8987222112  567     Ups_Meter567  7263193155  67      DGB_Meter                 96      DG_Meter96
'67,987'          67      DGB_Meter                 987     EB_Meter987   9255488721
'4321,96,912'     4321    EB_Meter      9980688666  96      DG_Meter96                912     DG_Meter912   8897634219
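
To reproduce the example, the two inputs could be built roughly as below. This is only a sketch: the names df1 and df2, the literal quotes inside the UserId strings, and storing PhoneNo as text are assumptions, and only the two rows shown in the input table are included.

```python
import pandas as pd

# Site table (assumed name: df1); the quotes are kept inside the strings,
# as they appear in the sample.
df1 = pd.DataFrame({
    "Site": ["ABC", "DEF"],
    "UserId": ["'456,567,67,96'", "'67,987'"],
})

# User lookup table (assumed name: df2). PhoneNo is stored as text here
# (an assumption) so that missing numbers can simply be left empty.
df2 = pd.DataFrame({
    "UserId": [4321, 987, 912, 567, 456, 96, 67],
    "UserName": ["EB_Meter", "EB_Meter987", "DG_Meter912", "Ups_Meter567",
                 "Ups_Meter456", "DG_Meter96", "DGB_Meter"],
    "PhoneNo": ["9980688666", "9255488721", "8897634219", "7263193155",
                "8987222112", None, None],
})

print(df1)
print(df2)
```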

Several columns are added per UserId here, so instead of map, use melt followed by merge with a left join; the reshaping back to wide form is done by DataFrame.pivot:

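# the split values are strings, so the lookup key must be a string too
df2['UserId'] = df2['UserId'].astype(str)
# drop the surrounding quotes, then split into one column per ID
df3 = df1['UserId'].str.strip("'").str.split(',', expand=True)

# long format -> look up name and phone -> back to wide format
df3 = (df3.reset_index()
          .melt('index', value_name='UserId')
          .merge(df2, on='UserId', how='left')
          .pivot(index='index', columns='variable')
          .sort_index(axis=1, level=1, sort_remaining=False)
          )
# flatten the MultiIndex columns to names such as UserId_1, UserName_1, PhoneNo_1
df3.columns = df3.columns.map(lambda x: f'{x[0]}_{x[1] + 1}')

df = df1.join(df3)
print(df)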
  Site         UserId UserId_1    UserName_1   PhoneNo_1 UserId_2  \
0  ABC  456,567,67,96      456  Ups_Meter456  8987222112      567   
1  DEF         67,987       67     DGB_Meter         NaN      987   

     UserName_2   PhoneNo_2 UserId_3 UserName_3 PhoneNo_3 UserId_4  \
0  Ups_Meter567  7263193155       67  DGB_Meter       NaN       96   
1   EB_Meter987  9255488721     None        NaN       NaN     None   

   UserName_4 PhoneNo_4  
0  DG_Meter96       NaN  
1         NaN       NaN  
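
To see why the left join matters, it can help to look at the intermediate frames before the pivot. The following is a minimal sketch on a cut-down copy of the sample data; the variable names wide, long_df and merged are illustrative only, and df2 is assumed to already hold UserId as strings.

```python
import pandas as pd

df1 = pd.DataFrame({"Site": ["ABC", "DEF"],
                    "UserId": ["'456,567,67,96'", "'67,987'"]})
df2 = pd.DataFrame({"UserId": ["456", "567", "67", "96", "987"],
                    "UserName": ["Ups_Meter456", "Ups_Meter567", "DGB_Meter",
                                 "DG_Meter96", "EB_Meter987"],
                    "PhoneNo": ["8987222112", "7263193155", None, None,
                                "9255488721"]})

# Step 1: one column per ID position (0..3); shorter rows are padded
# with missing values.
wide = df1["UserId"].str.strip("'").str.split(",", expand=True)

# Step 2: long form, one row per (original row, position) pair.
long_df = wide.reset_index().melt("index", value_name="UserId")

# Step 3: the left join keeps every (row, position) pair, including the
# padded ones, so the later pivot still sees all four positions.
merged = long_df.merge(df2, on="UserId", how="left")

print(long_df.shape)  # 2 rows x 4 positions = 8 rows, 3 columns
print(merged)
```

With an inner join the padded positions would be dropped at step 3, which is harmless for this sample but would also drop any real ID that has no entry in the user table.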

    

You can use:

df[['UserId1', 'UserId2', 'UserId3', 'UserId4']] = df['UserId'].str.split(",", expand=True)
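
Note that assigning to a fixed list of four names only works when the split produces exactly four columns. A possible variation, sketched here on sample data without the surrounding quotes (an assumption), builds the column names from however many parts the split returns:

```python
import pandas as pd

df = pd.DataFrame({"Site": ["ABC", "DEF"],
                   "UserId": ["456,567,67,96", "67,987"]})

# Split into as many columns as the longest row needs.
parts = df["UserId"].str.split(",", expand=True)

# Name the new columns UserId1, UserId2, ... based on the actual count.
parts.columns = [f"UserId{i + 1}" for i in range(parts.shape[1])]

df = df.join(parts)
print(df)
```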
