I have a dataframe that looks like the following:
   Colm1  ColmX  Colm2
0      1      2      3
1      4      5      6
I need a new one as follows:
   Colm1  ColmX  Colm2            Colm3
0      1      2      3  Colm1_1_Colm2_3
1      4      5      6  Colm1_4_Colm2_6
The merged value in Colm3 is an underscore-separated list of column-name/value pairs, built from a specific list of columns, in this case [Colm1, Colm2].
How do I go about doing this? My starting point is a list of the column names that need to be merged as above. Thank you!
A naive solution that hard-codes the column names:
In [62]: df['Colm3'] = 'Colm1_'+df['Colm1'].astype(str)+'_Colm2_'+df['Colm2'].astype(str)
In [63]: df
Out[63]:
   Colm1  ColmX  Colm2            Colm3
0      1      2      3  Colm1_1_Colm2_3
1      4      5      6  Colm1_4_Colm2_6
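A more generic solution that works for any list of column names:
cols = ['Colm1', 'Colm2']
df['new'] = df[cols].apply(lambda x: x.name+'_'+x.astype(str)).add('_').sum(1).str.rstrip('_')
Step by step (prefix each value with its column name, append an underscore, concatenate across each row, then strip the trailing underscore):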
In [4]: df[cols].apply(lambda x: x.name+'_'+x.astype(str))
Out[4]:
     Colm1    Colm2
0  Colm1_1  Colm2_3
1  Colm1_4  Colm2_6
In [5]: df[cols].apply(lambda x: x.name+'_'+x.astype(str)).add('_')
Out[5]:
      Colm1     Colm2
0  Colm1_1_  Colm2_3_
1  Colm1_4_  Colm2_6_
In [6]: df[cols].apply(lambda x: x.name+'_'+x.astype(str)).add('_').sum(1)
Out[6]:
0    Colm1_1_Colm2_3_
1    Colm1_4_Colm2_6_
dtype: object
In [7]: df[cols].apply(lambda x: x.name+'_'+x.astype(str)).add('_').sum(1).str.rstrip('_')
Out[7]:
0    Colm1_1_Colm2_3
1    Colm1_4_Colm2_6
dtype: object
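For reference, here is a minimal self-contained sketch of the same idea wrapped in a function. It is not taken from the answer above: the helper name `merge_columns` and its `sep` parameter are hypothetical, and it builds the string column by column instead of relying on `.sum(1)` and `.str.rstrip('_')`, so values that themselves end in an underscore are left untouched.

```python
import pandas as pd


def merge_columns(df, cols, sep="_"):
    """Return a Series of 'name<sep>value' pairs joined by sep (hypothetical helper)."""
    # One string Series per column, e.g. 'Colm1_1', 'Colm1_4'.
    parts = [col + sep + df[col].astype(str) for col in cols]
    # Concatenate the pieces left to right, separated by sep.
    merged = parts[0]
    for part in parts[1:]:
        merged = merged + sep + part
    return merged


# Sample frame matching the one in the question.
df = pd.DataFrame({"Colm1": [1, 4], "ColmX": [2, 5], "Colm2": [3, 6]})
df["Colm3"] = merge_columns(df, ["Colm1", "Colm2"])
print(df)
```

The function assumes `cols` is non-empty and that every name in it exists in `df`.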