Python: comparing values in two DataFrames

I am trying to compare the values of two DataFrames. If any value in a row differs, I want to return that row.

df1
     Value Value2
Name     
a      1      1
b      1      2
c      0      1

df2
     Value Value2
Name     
a      1      1
b      1      1
c      1      1

I ran df1 == df2, which gives:

df3
     Value Value2
Name
a    True  True 
b    True  False
c    False True

I want to return only b and c. How can I do it? I do not want to write

df3[(df3['Value']==False)|(df3['Value2']==False)]

because I may have more than two columns, and the column names can differ.

Assuming you have the data in a text file (d3.txt), you can read it into a list of rows (line):

with open("d3.txt") as f:
    line = [row.split() for row in f]

print(line)
[['#df3'], ['#', 'Value', 'Value2'], ['#Name'], ['#a', 'True', 'True'], ['#b', 'True', 'False'], ['#c', 'False', 'True']]

mydict = {}
for i in line:
    # key: first field of the row; value: remaining fields joined by commas
    mydict[i[0]] = ",".join(i[1:])

This builds a dictionary keyed by the first field of each row, so you can call

print(mydict)
print(mydict['#a'])  # depends on which name you want to look up

The output is

{'#': 'Value,Value2', '#c': 'False,True', '#b': 'True,False', '#a': 'True,True', '#Name': '', '#df3': ''}
True,True

Or you can do it this way, without creating the dictionary:

for row in line:
    if row[0] in ('#b', '#c'):
        print(row)

And the output is (maybe this is what you want):

['#b', 'True', 'False']  
['#c', 'False', 'True']
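
If the names are not known in advance, the same loop can probably be written without hard-coding '#b' and '#c'. The following is only a minimal sketch, assuming the parsed rows look like the `line` list printed above. The list is hard-coded here so the snippet runs without d3.txt, and the name `mismatched` is purely illustrative.

```python
# Parsed rows in the same shape as `line` above (hard-coded so that the
# sketch does not depend on d3.txt being present).
line = [['#df3'], ['#', 'Value', 'Value2'], ['#Name'],
        ['#a', 'True', 'True'], ['#b', 'True', 'False'], ['#c', 'False', 'True']]

# Keep every row in which at least one column after the name is 'False'.
# Header rows have no 'False' field, so they are skipped automatically.
mismatched = [row for row in line if 'False' in row[1:]]

for row in mismatched:
    print(row)
```

This works for any number of columns, because it checks every field after the name rather than specific positions.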

I think this should do it:

df3[~df3.all(axis=1)]
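
To make the one-liner easier to try, here is a small end-to-end sketch. It assumes both frames share the same index and the same column names (taken here to be Value and Value2), since `df1 == df2` raises an error for frames with different labels. The name `mismatch` is introduced only for illustration.

```python
import pandas as pd

# Sample frames from the question; identical labels in both are assumed.
names = pd.Index(["a", "b", "c"], name="Name")
df1 = pd.DataFrame({"Value": [1, 1, 0], "Value2": [1, 2, 1]}, index=names)
df2 = pd.DataFrame({"Value": [1, 1, 1], "Value2": [1, 1, 1]}, index=names)

df3 = df1 == df2              # element-wise comparison, True where equal
mismatch = ~df3.all(axis=1)   # True for rows where at least one column differs

print(df3[mismatch])          # the differing rows of the boolean frame
print(df1[mismatch])          # the same rows, taken from the original df1
```

Because `all(axis=1)` reduces across whatever columns exist, nothing here depends on the number of columns or on their names.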
