Taking the union of two dataframes and adding the respective columns separately

I have two dataframes and I need to create a sort of union of them, keyed on the first column 'A'.

import pandas as pd

df1 = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': [10, 20, 30], 'C': [2, 5, 3]})
df2 = pd.DataFrame({'A': ['a', 'b', 'd'], 'B': [34, 21, 45], 'C': [1, 5, 5]})

Desired output:

   A  f_M  d_M  f_B  d_B
0  a   10    2   34    1
1  b   20    5   21    5
2  c   30    3    0    0
3  d    0    0   45    5

The new dataframe contains the union of the 'A' values from both dataframes. The f_M and d_M columns hold the 'B' and 'C' values from df1, and f_B and d_B hold the 'B' and 'C' values from df2. A value is 0 when the key in 'A' is missing from the corresponding dataframe.

Use concat with set_index, and use add_suffix to avoid duplicated column names:

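# Map the original column names to the short names wanted in the output
d = {'B': 'f', 'C': 'd'}
# Align both frames on 'A', tag their columns with _M / _B, and fill gaps with 0
df = (pd.concat([df1.set_index('A').rename(columns=d).add_suffix('_M'),
                 df2.set_index('A').rename(columns=d).add_suffix('_B')], axis=1)
        .fillna(0)
        )
print(df)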
    f_M  d_M   f_B  d_B
a  10.0  2.0  34.0  1.0
b  20.0  5.0  21.0  5.0
c  30.0  3.0   0.0  0.0
d   0.0  0.0  45.0  5.0

And to get 'A' back as a regular column:

d = {'B': 'f', 'C': 'd'}
df = (pd.concat([df1.set_index('A').rename(columns=d).add_suffix('_M'),
                 df2.set_index('A').rename(columns=d).add_suffix('_B')], axis=1)
        .fillna(0)
        .rename_axis('A')   # name the index so reset_index creates column 'A'
        .reset_index()
        )
print(df)
   A   f_M  d_M   f_B  d_B
0  a  10.0  2.0  34.0  1.0
1  b  20.0  5.0  21.0  5.0
2  c  30.0  3.0   0.0  0.0
3  d   0.0  0.0  45.0  5.0
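The output above is float because the missing entries are NaN before fillna runs, while the desired output in the question shows integers. A minimal, self-contained sketch of one way to get integers back, assuming every value is a whole number (as it is in the sample data), is to cast after filling. The name result is only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': [10, 20, 30], 'C': [2, 5, 3]})
df2 = pd.DataFrame({'A': ['a', 'b', 'd'], 'B': [34, 21, 45], 'C': [1, 5, 5]})

d = {'B': 'f', 'C': 'd'}
result = (
    pd.concat(
        [df1.set_index('A').rename(columns=d).add_suffix('_M'),
         df2.set_index('A').rename(columns=d).add_suffix('_B')],
        axis=1,
    )
    .fillna(0)         # keys missing from one frame become 0 (still float here)
    .astype(int)       # assumption: all values are whole numbers, so the cast is lossless
    .rename_axis('A')  # name the index so reset_index creates column 'A'
    .reset_index()
)
print(result)
```

The cast is done before reset_index so that it only touches the numeric columns and not 'A'.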

This looks like a classic use case for an outer merge:

df1.merge(df2, on='A', how='outer', suffixes=('_M', '_B')).fillna(0)

   A   B_M  C_M   B_B  C_B
0  a  10.0  2.0  34.0  1.0
1  b  20.0  5.0  21.0  5.0
2  c  30.0  3.0   0.0  0.0
3  d   0.0  0.0  45.0  5.0

To get the requested column names, rename both frames first, using map with DataFrame.rename:

pd.merge(*map(
      lambda x: x.rename(columns={'B':'f','C':'d'}), (df1, df2)
   ), on='A', how='outer', suffixes=('_M', '_B')
).fillna(0)

   A   f_M  d_M   f_B  d_B
0  a  10.0  2.0  34.0  1.0
1  b  20.0  5.0  21.0  5.0
2  c  30.0  3.0   0.0  0.0
3  d   0.0  0.0  45.0  5.0
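As a quick sanity check, the concat approach and the outer merge approach can be compared directly. The sketch below is self-contained and uses illustrative names (names, via_concat, via_merge). It sorts both results by 'A' before comparing, so the check does not depend on the row order either method happens to produce.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': [10, 20, 30], 'C': [2, 5, 3]})
df2 = pd.DataFrame({'A': ['a', 'b', 'd'], 'B': [34, 21, 45], 'C': [1, 5, 5]})

names = {'B': 'f', 'C': 'd'}

# Approach 1: align on the index with concat, then restore 'A' as a column
via_concat = (
    pd.concat(
        [df1.set_index('A').rename(columns=names).add_suffix('_M'),
         df2.set_index('A').rename(columns=names).add_suffix('_B')],
        axis=1,
    )
    .fillna(0)
    .rename_axis('A')
    .reset_index()
)

# Approach 2: rename first, then outer merge; suffixes are applied to the
# overlapping column names 'f' and 'd'
via_merge = (
    df1.rename(columns=names)
       .merge(df2.rename(columns=names), on='A', how='outer', suffixes=('_M', '_B'))
       .fillna(0)
)

# Normalise row order and index before comparing
via_concat = via_concat.sort_values('A').reset_index(drop=True)
via_merge = via_merge.sort_values('A').reset_index(drop=True)

# Raises AssertionError if the two frames differ
pd.testing.assert_frame_equal(via_concat, via_merge)
print(via_merge)
```

For this data both approaches give the same frame, so the choice between them is mostly a matter of taste. Note that concat along axis=1 aligns on the index, which generally needs the values in 'A' to be unique within each frame, whereas merge handles repeated keys by producing one row per matching pair.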
