How do I combine two dataframes by both overlapping columns and indices?
Suppose I have two dataframes with overlapping column and index names that look like this:
   A  B  C  D
A  0  1  0  1
B  0  1  1  0
C  1  0  1  0
D  0  0  0  1

   A  C  D  E
A  1  0  0  0
B  0  1  0  0
D  0  0  0  0
E  1  0  0  1
I want to combine these two dataframes into one such that cells with the same column and index names are combined. The end result should look like this:
   A  B  C  D  E
A  1  1  0  1  0
B  0  1  1  0  0
C  1  0  1  0  0
D  0  0  0  1  0
E  1  0  0  0  1
I've tried using pandas.concat, but it only concatenates along one of the axes.
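To make the problem reproducible, here is a minimal sketch that rebuilds the two frames by hand and shows what a plain row-wise pd.concat does with them. The names df1, df2 and stacked are only illustrative; the values are copied from the tables above.

```python
import pandas as pd

# The two frames from the question, rebuilt by hand.
df1 = pd.DataFrame(
    [[0, 1, 0, 1],
     [0, 1, 1, 0],
     [1, 0, 1, 0],
     [0, 0, 0, 1]],
    index=list("ABCD"), columns=list("ABCD"),
)
df2 = pd.DataFrame(
    [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 0],
     [1, 0, 0, 1]],
    index=list("ABDE"), columns=list("ACDE"),
)

# Concatenating along the rows takes the union of the columns but simply
# stacks the rows, so the labels A, B and D each appear twice and the
# cells that one frame lacks are filled with NaN.
stacked = pd.concat([df1, df2])
print(stacked.shape)            # number of rows and columns after stacking
print(stacked.index.is_unique)  # whether the row labels are still unique
```

The stacked frame has duplicate row labels rather than one merged row per label, which is why concat alone does not give the desired result.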
How about:
(df1.add(df2, fill_value=0)
    .fillna(0)
    .gt(0)
    .astype(int))
Output:
   A  B  C  D  E
A  1  1  0  1  0
B  0  1  1  0  0
C  1  0  1  0  0
D  0  0  0  1  0
E  1  0  0  0  1
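For readers who want to see why each step in that chain is there, here is a self-contained sketch of the same approach with the intermediate sum kept in a variable. The names summed and result are only illustrative, and the frames are rebuilt from the question's tables.

```python
import pandas as pd

df1 = pd.DataFrame(
    [[0, 1, 0, 1], [0, 1, 1, 0], [1, 0, 1, 0], [0, 0, 0, 1]],
    index=list("ABCD"), columns=list("ABCD"),
)
df2 = pd.DataFrame(
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [1, 0, 0, 1]],
    index=list("ABDE"), columns=list("ACDE"),
)

# add() aligns on both axes. fill_value=0 covers cells present in only one
# frame; cells absent from BOTH frames (here C/E and E/B) are still NaN.
summed = df1.add(df2, fill_value=0)

# fillna(0) handles those remaining NaN cells, gt(0) turns any positive
# sum (including 1 + 1 = 2) into True, and astype(int) gives back 0/1.
result = summed.fillna(0).gt(0).astype(int)
print(result)
```

Note that the gt(0) step makes this a logical OR of the two frames, so it assumes the data is binary (or that only "non-zero anywhere" matters).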
align and np.maximum
pandas.DataFrame.align will produce a copy of the calling DataFrame and the argument DataFrame with their index and columns attributes aligned, and return them as a tuple of two DataFrames. Pass both to numpy.maximum, which will conveniently respect that these are pandas.DataFrame objects and return a new DataFrame with the appropriate maximal values.

np.maximum(*df1.align(df2, fill_value=0))
   A  B  C  D  E
A  1  1  0  1  0
B  0  1  1  0  0
C  1  0  1  0  0
D  0  0  0  1  0
E  1  0  0  0  1
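The one-liner can be unpacked into two steps to make the alignment visible. This is a sketch under the same assumptions as the question (frames rebuilt from the tables above); the names left, right and result are only illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(
    [[0, 1, 0, 1], [0, 1, 1, 0], [1, 0, 1, 0], [0, 0, 0, 1]],
    index=list("ABCD"), columns=list("ABCD"),
)
df2 = pd.DataFrame(
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [1, 0, 0, 1]],
    index=list("ABDE"), columns=list("ACDE"),
)

# align() defaults to an outer join on both axes, so each returned frame
# covers the union of row and column labels, with gaps filled by 0.
left, right = df1.align(df2, fill_value=0)

# With identical labels on both sides, the element-wise maximum is
# well defined and comes back as a DataFrame.
result = np.maximum(left, right)
print(result)
```

For 0/1 data this gives the same result as the sum-and-threshold answer. For other values the two differ: the maximum keeps the larger of the two cells, while the gt(0) version collapses everything to 0 or 1.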