
How to compare 3 columns of DataFrame together, Python 3.6

I have the dataframe below and I want to compare the values of 3 columns and write True/False to another column, "Id_Name_Table_Matching".

My code is below:

import pandas as pd

L1_ID = ['Region', 'Col2', 'Col3', 'Col4', 'Col5']
L1_Name = ['Region', 'Col2', 'Col3', 'Col4', 'Col5']
L1_Table = ['Region', 'Col2', 'Col3', 'Col4', 'Col5']

DF1 = pd.DataFrame({'dimId': L1_ID, 'dimName': L1_Name, 'sqlTableColumn': L1_Table})

I want to set "Id_Name_Table_Matching" to True if the values in all three columns match, otherwise False. I need something like the script below:

DF1['Id_Name_Table_Matching'] = DF1['dimId'] == DF1['dimName'] == DF1['sqlTableColumn']

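The chained form in the question fails because Python expands a == b == c to (a == b) and (b == c), which asks for the truth value of a whole Series and raises ValueError. Instead, compare the first column with the second, then with the last, and combine the two boolean masks with & (element-wise AND):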

DF1['Id_Name_Table_Matching'] = ((DF1['dimId'] == DF1['dimName']) &
                                 (DF1['dimId'] == DF1['sqlTableColumn']))
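
As a rough illustration of the difference, the sketch below uses a small made-up frame (the mismatching values 'Other' and 'ColX' are assumptions, not data from the question) to show the chained comparison raising an error while the & version returns a row-by-row mask:

```python
import pandas as pd

# Hypothetical sample data with deliberate mismatches in rows 1 and 2.
df = pd.DataFrame({
    'dimId': ['Region', 'Col2', 'Col3'],
    'dimName': ['Region', 'Col2', 'Other'],
    'sqlTableColumn': ['Region', 'ColX', 'Col3'],
})

# Chained comparison: Python evaluates (a == b) and (b == c); `and` needs
# one truth value for the whole Series, so pandas raises ValueError.
try:
    df['dimId'] == df['dimName'] == df['sqlTableColumn']
    chained_error = None
except ValueError as exc:
    chained_error = exc

# Element-wise alternative: each comparison yields a boolean Series,
# and & combines them position by position.
mask = (df['dimId'] == df['dimName']) & (df['dimId'] == df['sqlTableColumn'])
print(mask.tolist())
```

The parentheses around each comparison matter, because & binds more tightly than == in Python.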

General solution for comparing multiple columns defined in a list: compare all the selected columns against the first one with DataFrame.eq, then check whether all values in each row are True with DataFrame.all:

cols = ['dimId','dimName','sqlTableColumn']
DF1['Id_Name_Table_Matching'] = DF1[cols].eq(DF1[cols[0]], axis=0).all(axis=1)
print (DF1)
    dimId dimName sqlTableColumn  Id_Name_Table_Matching
0  Region  Region         Region                    True
1    Col2    Col2           Col2                    True
2    Col3    Col3           Col3                    True
3    Col4    Col4           Col4                    True
4    Col5    Col5           Col5                    True

Details:

print (DF1[cols].eq(DF1[cols[0]], axis=0))
   dimId  dimName  sqlTableColumn
0   True     True            True
1   True     True            True
2   True     True            True
3   True     True            True
4   True     True            True
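
If this check is needed in several places, it could be wrapped in a small helper. The function name all_columns_equal and the sample frame (including the value 'Changed') are hypothetical, introduced only for this sketch:

```python
import pandas as pd

def all_columns_equal(df, cols):
    """Return a boolean Series: True where every column in cols equals the first one."""
    # eq(..., axis=0) aligns the first column with the row index and compares
    # it against each selected column; all(axis=1) then reduces across each row.
    return df[cols].eq(df[cols[0]], axis=0).all(axis=1)

# Hypothetical sample data with a mismatch in the last row.
sample = pd.DataFrame({
    'dimId': ['Region', 'Col2', 'Col3'],
    'dimName': ['Region', 'Col2', 'Col3'],
    'sqlTableColumn': ['Region', 'Col2', 'Changed'],
})
sample['Id_Name_Table_Matching'] = all_columns_equal(
    sample, ['dimId', 'dimName', 'sqlTableColumn'])
print(sample)
```

Note that missing values never compare equal to each other, so a row containing NaN in the compared columns comes out as False with this approach.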

See if this helps, using .apply():

DF1["Id_Name_Table_Matching"] = DF1.apply(lambda x: x.dimId == x.dimName == x.sqlTableColumn, axis=1)
print(DF1)

Output:

    dimId dimName sqlTableColumn  Id_Name_Table_Matching
0  Region  Region         Region                    True
1    Col2    Col2           Col2                    True
2    Col3    Col3           Col3                    True
3    Col4    Col4           Col4                    True
4    Col5    Col5           Col5                    True
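
The chained == works inside the lambda because each row supplies plain scalar values rather than whole Series. Since apply with axis=1 calls the function once per row, it tends to be slower than the vectorized comparison on large frames. A minimal sketch, using made-up sample data, that checks the two approaches agree:

```python
import pandas as pd

# Hypothetical sample data with a mismatch in row 1.
df = pd.DataFrame({
    'dimId': ['Region', 'Col2', 'Col3'],
    'dimName': ['Region', 'Other', 'Col3'],
    'sqlTableColumn': ['Region', 'Col2', 'Col3'],
})

# Row-wise: the lambda sees scalar strings, so chained == is valid Python.
row_wise = df.apply(
    lambda row: row['dimId'] == row['dimName'] == row['sqlTableColumn'],
    axis=1,
)

# Vectorized: the same test expressed on whole columns.
vectorized = (df['dimId'] == df['dimName']) & (df['dimId'] == df['sqlTableColumn'])

print(row_wise.tolist() == vectorized.tolist())
```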

You can also transpose with .T and use .nunique(), applied to the three columns being compared:

DF1[['dimId', 'dimName', 'sqlTableColumn']].T.nunique().le(1)

0    True
1    True
2    True
3    True
4    True
dtype: bool
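
A possible variant, sketched here with made-up sample data, counts distinct values per row with nunique(axis=1), which avoids the transpose. It also shows why the column selection matters once the result column has been added to the frame:

```python
import pandas as pd

# Hypothetical sample data with a mismatch in row 1.
df = pd.DataFrame({
    'dimId': ['Region', 'Col2'],
    'dimName': ['Region', 'Col2'],
    'sqlTableColumn': ['Region', 'Other'],
})
cols = ['dimId', 'dimName', 'sqlTableColumn']

# A row matches when it holds at most one distinct value across cols.
df['Id_Name_Table_Matching'] = df[cols].nunique(axis=1).le(1)

# Pitfall: on the whole frame, the new boolean column is counted as an
# extra distinct value, so every row here reports a mismatch.
whole_frame = df.T.nunique().le(1)

print(df['Id_Name_Table_Matching'].tolist())
print(whole_frame.tolist())
```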
