
How do I combine two dataframes by both overlapping columns and indices?

Suppose I have two dataframes whose column and index names partly overlap, like this:

  A B C D
A 0 1 0 1
B 0 1 1 0
C 1 0 1 0
D 0 0 0 1

  A C D E
A 1 0 0 0
B 0 1 0 0
D 0 0 0 0
E 1 0 0 1

I want to combine these two dataframes into one, so that cells with the same column and index names are merged into a single cell. The end result should look like this:

  A B C D E
A 1 1 0 1 0
B 0 1 1 0 0
C 1 0 1 0 0
D 0 0 0 1 0
E 1 0 0 0 1

I've tried using pandas.concat, but it only concatenates along one of the axes.

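How about:

# df1 and df2 are the two dataframes from the question
(df1.add(df2, fill_value=0)   # sum; a cell missing from only one frame counts as 0
    .fillna(0)                # cells missing from both frames are still NaN
    .gt(0)                    # a positive sum means at least one frame had a 1
    .astype(int))             # convert True/False back to 1/0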

Output:

    A   B   C   D   E
A   1   1   0   1   0
B   0   1   1   0   0
C   1   0   1   0   0
D   0   0   0   1   0
E   1   0   0   0   1
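
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the two frames from the question and applies the same chain. The names summed and combined are only illustrative. It also shows why the fillna(0) step is there: fill_value=0 only helps when a cell exists in one of the two frames, so (C, E) and (E, B), which exist in neither, come out of add as NaN.

```python
import pandas as pd

# Rebuild the two example frames from the question.
df1 = pd.DataFrame(
    [[0, 1, 0, 1],
     [0, 1, 1, 0],
     [1, 0, 1, 0],
     [0, 0, 0, 1]],
    index=list("ABCD"), columns=list("ABCD"),
)
df2 = pd.DataFrame(
    [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 0],
     [1, 0, 0, 1]],
    index=list("ABDE"), columns=list("ACDE"),
)

# Element-wise sum over the union of both indexes and both column sets.
# A cell missing from exactly one frame is treated as 0 there.
summed = df1.add(df2, fill_value=0)

# Cells missing from BOTH frames are still NaN at this point.
print(summed.isna().sum().sum())

# Fill those gaps, then collapse any positive sum (1 or 2) back to 1.
combined = summed.fillna(0).gt(0).astype(int)
print(combined)
```

Note that this relies on the data being 0/1 flags: gt(0) turns a sum of 2 (a 1 in both frames, as in cell (B, C)) back into 1.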

align and np.maximum

  • pandas.DataFrame.align produces copies of the calling DataFrame and the argument DataFrame with their index and columns aligned, and returns them as a tuple of two DataFrames. With fill_value=0, every cell that is missing from one of them is filled with 0.
  • Pass both to numpy.maximum, which conveniently respects that these are pandas.DataFrame objects and returns a new DataFrame holding the element-wise maximum.

np.maximum(*df1.align(df2, fill_value=0))

   A  B  C  D  E
A  1  1  0  1  0
B  0  1  1  0  0
C  1  0  1  0  0
D  0  0  0  1  0
E  1  0  0  0  1
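
A minimal runnable sketch of the same idea, split into two steps so the intermediate frames can be inspected. The frames are rebuilt from the question; left, right and result are illustrative names, not anything pandas requires.

```python
import numpy as np
import pandas as pd

# Rebuild the two example frames from the question.
df1 = pd.DataFrame(
    [[0, 1, 0, 1],
     [0, 1, 1, 0],
     [1, 0, 1, 0],
     [0, 0, 0, 1]],
    index=list("ABCD"), columns=list("ABCD"),
)
df2 = pd.DataFrame(
    [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 0],
     [1, 0, 0, 1]],
    index=list("ABDE"), columns=list("ACDE"),
)

# Step 1: expand both frames to the union of row and column labels
# (the default join is "outer"), filling newly created cells with 0.
left, right = df1.align(df2, fill_value=0)
print(left.shape, right.shape)

# Step 2: take the element-wise maximum of the two aligned frames.
result = np.maximum(left, right)
print(result)
```

Unlike the add-based version, this takes a maximum rather than a sum, so it needs no extra fillna or gt step, and it would also keep the larger value if the cells held numbers other than 0 and 1.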
