Check for differences between the columns of two pandas data frames side by side

I thought this solution would solve my problem, but the OP there needed to check whether the rows of his two data frames contained a difference. I want to do the same for the columns. The solution was ne = (df1 != df2).any(1), but that does not help with my columns. I checked, and both of my data frames have exactly the same shape. If I do df1 == df2, I get a new data frame full of True and False values. Looking at the first hundred rows, most of the columns, with a few exceptions, appear to be equal. How can I get just one True / False for each column?

Here is a toy example:

import numpy as np
import pandas as pd
df1 = pd.DataFrame(np.random.randint(low=0, high=10, size=(5, 5)),  columns=['a', 'b', 'c', 'd', 'e'])
df2 = df1.copy()
df2.at[3,'d'] += 10

Desired output:

A True
B True
C True
D False
E True

Use DataFrame.all to check whether all values in each column are True (its default axis=0 reduces over the rows, giving one result per column):

print ((df1 == df2).all())
a     True
b     True
c     True
d    False
e     True
dtype: bool

Detail:

print (df1 == df2)

      a     b     c      d     e
0  True  True  True   True  True
1  True  True  True   True  True
2  True  True  True   True  True
3  True  True  True  False  True
4  True  True  True   True  True
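
For reference, here is a minimal self-contained sketch of the same check. It assumes fixed toy values instead of the random integers from the question, so that the result is reproducible; the names same and changed are only illustrative.

```python
import pandas as pd

# Fixed toy data (an assumption; the question uses random integers)
df1 = pd.DataFrame(
    [[1, 2, 3, 4, 5],
     [6, 7, 8, 9, 0],
     [1, 3, 5, 7, 9],
     [2, 4, 6, 8, 0],
     [5, 4, 3, 2, 1]],
    columns=['a', 'b', 'c', 'd', 'e'],
)
df2 = df1.copy()
df2.at[3, 'd'] += 10  # change a single cell in column 'd'

# all() defaults to axis=0, so each column collapses to one boolean
same = (df1 == df2).all()

# Names of the columns that contain at least one difference
changed = same.index[~same].tolist()

print(same)
print(changed)
```

Indexing same.index with the inverted mask is one way to get just the names of the columns that differ, which can be handy when the frames are wide.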

A solution with any is also possible; you only need to invert the output with ~ :

print (~((df1 != df2).any()))

a     True
b     True
c     True
d    False
e     True
dtype: bool
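
The two forms should always agree, since "all equal" is the negation of "any different". One caveat that the question does not mention, and which may not apply to your data: NaN never compares equal to NaN, so a column with missing values in the same positions in both frames is reported as different. The sketch below uses small made-up frames to show this, plus one possible NaN-aware variant (nan_safe is an illustrative name, not part of the original answer).

```python
import numpy as np
import pandas as pd

# Made-up data: column 'a' has a NaN in the same position in both frames
df1 = pd.DataFrame({'a': [1.0, np.nan, 3.0], 'b': [1.0, 2.0, 3.0]})
df2 = df1.copy()
df2.at[2, 'b'] = 30.0  # a real difference in column 'b'

via_all = (df1 == df2).all()
via_any = ~((df1 != df2).any())

# Both formulations give the same per-column result
assert via_all.equals(via_any)

# NaN == NaN is False, so column 'a' counts as different here
print(via_all)

# Possible NaN-aware variant: cells also match when both are missing
nan_safe = ((df1 == df2) | (df1.isna() & df2.isna())).all()
print(nan_safe)
```

Whether matching NaNs should count as equal depends on what the comparison is for, so treat the last variant as an option rather than a default.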
