How do I merge two columns from two DataFrames into one column of a new DataFrame (pandas)?

Question

I want to merge the values of two columns from two different pandas dataframes into a single column of a new dataframe.

pandas df1 =         

        hapX
  pos   0.0
1 721   0.2
2 735   0.5
3 739   1.0


pandas df2 =       

        hapY
  pos   0.1
1 721   0.0
2 735   0.6
3 739   1.5

I want to generate a new dataframe like:

  df_joined['hapX|Y'] = df1.astype(str).add('|').add(df2.astype(str))

with expected output:

        hapX|Y
  pos   0.0|0.1
1 721   0.2|0.0
2 735   0.5|0.6
3 739   1.0|1.5

But this outputs a bunch of NaN values:

        hapX    hapY
  pos   NaN      NaN
1 721   NaN      NaN
2 735   NaN      NaN
3 739   NaN      NaN

Is the problem that the values are floats? (I don't think so.) What is wrong with my approach?

Also, is there a way to automate the process if the column names are like hapX1, hapX2, hapX3 in one dataframe and hapY1, hapY2, hapY3 in the other?

Thanks,

Answer 1

You can merge the two dataframes and then concatenate the hapX and hapY columns. Say your first column is named no.

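# Align the rows of both frames on the shared key column
df_joined = df1.merge(df2, on='no')
df_joined['hapX|Y'] = df_joined['hapX'].astype(str) + '|' + df_joined['hapY'].astype(str)
# drop() returns a new frame, so assign the result back
df_joined = df_joined.drop(['hapX', 'hapY'], axis=1)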

This gives you

    no  hapX|Y
0   pos 0.0|0.1
1   721 0.2|0.0
2   735 0.5|0.6
3   739 1.0|1.5
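
For reference, here is a self-contained sketch of the merge approach above. The sample frames are reconstructed from the question under the assumption that pos is an ordinary column (the answer calls the key no); adjust the key name to match your data.

```python
import pandas as pd

# Assumed layout: `pos` is a regular column shared by both frames.
df1 = pd.DataFrame({'pos': [721, 735, 739], 'hapX': [0.2, 0.5, 1.0]})
df2 = pd.DataFrame({'pos': [721, 735, 739], 'hapY': [0.0, 0.6, 1.5]})

# Align rows on the shared key, then join the two value columns as strings.
merged = df1.merge(df2, on='pos')
merged['hapX|Y'] = merged['hapX'].astype(str) + '|' + merged['hapY'].astype(str)

# drop() returns a new frame, so keep the result.
result = merged.drop(columns=['hapX', 'hapY'])
print(result)
```

Merging on an explicit key avoids relying on the two frames having their rows in the same order.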

Answer 2

Just to add to the previous answer, here is the general case of N DataFrames.

Suppose you have a number of DataFrames as follows:

import random
import pandas as pd

dfs = [pd.DataFrame({'hapY'+str(j): [random.random() for i in range(10)]}) for j in range(5)]

such that

>>> dfs[0]
      hapY0
0  0.175683
1  0.353729
2  0.949848
3  0.346088
4  0.435292
5  0.837879
6  0.277274
7  0.623121
8  0.325119
9  0.709252

Then,

>>> list(map('|'.join, zip(*[dfs[j]['hapY' + str(j)].astype(str) for j in range(5)])))
['0.0845464936138|0.193336164837|0.551717121013|0.113566029656|0.479590342798',
 '0.275851474238|0.694161791339|0.151607726092|0.615367668451|0.498997567849',
 '0.116891472119|0.258406028668|0.315137581816|0.819992354178|0.864412473301',
 '0.729581942312|0.614902776003|0.443986436146|0.227782256619|0.0149481683863',
 '0.745583477173|0.441456815889|0.428691631831|0.307480112319|0.136790112739',
 '0.981337451224|0.0117895017035|0.415140979617|0.650957722911|0.968082350568',
 '0.725618728314|0.0546057041356|0.715910454674|0.0828229441557|0.220878025678',
 '0.704047455894|0.303403129266|0.0499082759635|0.49727194707|0.251623048104',
 '0.453595354131|0.146042134766|0.346665276655|0.911092176243|0.291405609407',
 '0.140523603089|0.117930249858|0.902071673051|0.0804933425857|0.876006332635']

which you can later put into a DataFrame.
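
One possible way to do that last step is sketched below, using small fixed values instead of random ones so the result is reproducible. The names wide, joined and out are illustrative, and the sketch assumes all frames share the same index.

```python
import pandas as pd

# Three single-column frames that share the default integer index.
dfs = [
    pd.DataFrame({'hapY0': [0.1, 0.2]}),
    pd.DataFrame({'hapY1': [0.3, 0.4]}),
    pd.DataFrame({'hapY2': [0.5, 0.6]}),
]

# Place the columns side by side and convert every value to a string.
wide = pd.concat(dfs, axis=1).astype(str)

# Join the strings of each row with '|'.
joined = wide.apply('|'.join, axis=1)

# Wrap the resulting Series in a one-column DataFrame.
out = joined.to_frame(name='|'.join(wide.columns))
print(out)
```

Because pd.concat aligns on the index, this keeps rows matched even when the frames are not in the same row order, which the zip-based version does not guarantee.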

Answer 3

I think the simplest approach is to rename the columns with a dict, which can be created with a dict comprehension, and finally call add_suffix:

print (df1) 
     hapX1  hapX2  hapX3  hapX4
pos                            
23     1.0    0.0    1.0    1.0
24     1.0    1.0    1.5    1.0
28     1.0    0.0    0.5    0.0

print (df2)
     hapY1  hapY2  hapY3  hapY4
pos                            
23     0.0    1.0    0.5    0.0
24     1.0    1.0    1.5    1.0
28     0.0    1.0    1.0    1.0

d = {'hapY' + str(x):'hapX' + str(x) for x in range(1,5)}
print (d)
{'hapY1': 'hapX1', 'hapY3': 'hapX3', 'hapY2': 'hapX2', 'hapY4': 'hapX4'}

df_joined = df1.astype(str).add('|').add(df2.rename(columns=d).astype(str)).add_suffix('|Y')
print (df_joined) 

     hapX1|Y  hapX2|Y  hapX3|Y  hapX4|Y
pos                                    
23   1.0|0.0  0.0|1.0  1.0|0.5  1.0|0.0
24   1.0|1.0  1.0|1.0  1.5|1.5  1.0|1.0
28   1.0|0.0  0.0|1.0  0.5|1.0  0.0|1.0
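
The rename matters because DataFrame arithmetic aligns on column labels as well as on the index: a column that exists in only one operand comes back as NaN, which is what happened in the question. The sketch below shows that alignment on numeric frames, then pairs the columns by position instead of hard-coding range(1, 5). It assumes both frames list their columns in corresponding order; the name mapping and the output column labels are illustrative.

```python
import pandas as pd

idx = pd.Index([23, 24], name='pos')
df1 = pd.DataFrame({'hapX1': [1.0, 1.0], 'hapX2': [0.0, 1.0]}, index=idx)
df2 = pd.DataFrame({'hapY1': [0.0, 1.0], 'hapY2': [1.0, 1.0]}, index=idx)

# No column label is shared, so every cell of the aligned result is NaN.
misaligned = df1 + df2
print(misaligned.isna().all().all())

# Pair columns by position (assumes the same order in both frames).
mapping = dict(zip(df2.columns, df1.columns))
joined = df1.astype(str).add('|').add(df2.rename(columns=mapping).astype(str))

# Label each output column with both source names.
joined.columns = [x + '|' + y for x, y in zip(df1.columns, df2.columns)]
print(joined)
```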

Answer 1: 1, accepted, 2017-03-13 02:13:48

Answer 2: 1, 2017-03-13 02:23:54

Answer 3: 1, 2017-03-13 06:25:32
