Want to compare two data frames using a column value

I have two similar data frames and want to compare them using the values in the first column.

emp ID  FirstName Lastname
1       Prasanna  K
2       Siva      B

emp ID  FirstName Lastname
1       Prasana   K
2       Siva      B
3       Karunas   Y

I want to compare the two DataFrames on Emp ID and identify the unique, non-unique, and new items.

Thanks.
-Prasanna.K

You can use something like the following:

>>> import pandas as pd
>>> import numpy as np
>>>
>>> dictA = {'emp ID': [0, 1],
...          'FirstName': ['Prasanna', 'Siva'],
...          'LastName': ['K', 'B']}
>>> dictB = {'emp ID': [0, 1, 2],
...          'FirstName': ['Prasanna', 'Siva', 'Karunas'],
...          'LastName': ['K', 'B', 'Y']}
>>>
>>> dfA = pd.DataFrame(dictA)
>>> dfB = pd.DataFrame(dictB)
>>>
>>> dfA
   emp ID FirstName LastName
0       0  Prasanna        K
1       1      Siva        B
>>> dfB
   emp ID FirstName LastName
0       0  Prasanna        K
1       1      Siva        B
2       2   Karunas        Y

# Flag each row of dataframe B by whether its emp ID is also present in dataframe A (False marks rows found only in B)
>>> dfB['present'] = dfB['emp ID'].isin(dfA['emp ID'])
>>> dfB
   emp ID FirstName LastName  present
0       0  Prasanna        K     True
1       1      Siva        B     True
2       2   Karunas        Y    False

# Flag each row of dataframe A by whether its emp ID is also present in dataframe B (False marks rows found only in A)
>>> dfA['present'] = dfA['emp ID'].isin(dfB['emp ID'])
>>> dfA
   emp ID FirstName LastName  present
0       0  Prasanna        K     True
1       1      Siva        B     True
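The two `isin` checks above can also be folded into a single pass. The following is only a sketch of one alternative, not part of the original answer: it uses an outer `merge` on `emp ID` with `indicator=True`, which adds a `_merge` column saying which side each key came from. The `status` column and its label strings are names made up for this illustration, and the sketch assumes `emp ID` is unique within each frame.

```python
import pandas as pd

dfA = pd.DataFrame({'emp ID': [0, 1],
                    'FirstName': ['Prasanna', 'Siva'],
                    'LastName': ['K', 'B']})
dfB = pd.DataFrame({'emp ID': [0, 1, 2],
                    'FirstName': ['Prasanna', 'Siva', 'Karunas'],
                    'LastName': ['K', 'B', 'Y']})

# Outer merge on the key column only; indicator=True adds a '_merge'
# column with the values 'left_only', 'right_only' or 'both'.
merged = dfA[['emp ID']].merge(dfB[['emp ID']], on='emp ID',
                               how='outer', indicator=True)

# Hypothetical, more readable labels (assumption: not from the original answer).
labels = {'left_only': 'only in A', 'right_only': 'only in B', 'both': 'in both'}
merged['status'] = merged['_merge'].astype(str).map(labels)

only_in_A = merged.loc[merged['status'] == 'only in A', 'emp ID'].tolist()
only_in_B = merged.loc[merged['status'] == 'only in B', 'emp ID'].tolist()
in_both = merged.loc[merged['status'] == 'in both', 'emp ID'].tolist()

print(merged[['emp ID', 'status']])
```

With the sample frames used here, emp ID 2 is the only key that exists in B but not in A, so it is the one "new" item; if `emp ID` can repeat within a frame, the merge would multiply rows and the keys should be de-duplicated first.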

Edit, as per the OP's comment:

>>> import pandas as pd
>>> import numpy as np
>>>
>>> dictA = {'emp ID': [0, 1, 2, 3],
...          'FirstName': ['Prasanna', 'Siva', 'Bala', 'foo'],
...          'LastName': ['K', 'B', 'Y', 'Y_F']}
>>> dictB = {'emp ID': [0, 1, 2],
...          'FirstName': ['Prasanna', 'Siva', 'Karunas'],
...          'LastName': ['K', 'B', 'Y']}
>>>
>>> dfA = pd.DataFrame(dictA)
>>> dfB = pd.DataFrame(dictB)
>>>
>>> dfA
   emp ID FirstName LastName
0       0  Prasanna        K
1       1      Siva        B
2       2      Bala        Y
3       3       foo      Y_F
>>> dfB
   emp ID FirstName LastName
0       0  Prasanna        K
1       1      Siva        B
2       2   Karunas        Y
>>>
>>> # Flag rows of dataframe B whose value in every column also appears in the matching column of dataframe A
>>> dfB['same_all'] = dfB['emp ID'].isin(dfA['emp ID']) & dfB['FirstName'].isin(dfA['FirstName']) & dfB['LastName'].isin(dfA['LastName'])
>>> dfB
   emp ID FirstName LastName  same_all
0       0  Prasanna        K      True
1       1      Siva        B      True
2       2   Karunas        Y     False
>>>
>>> # Or check each column separately, here for dataframe B
>>> dfB['same_emp_ID'] = dfB['emp ID'].isin(dfA['emp ID'])
>>> dfB['same_FirstName'] = dfB['FirstName'].isin(dfA['FirstName'])
>>> dfB['same_LastName'] = dfB['LastName'].isin(dfA['LastName'])
>>>
>>> # Flag rows of dataframe A whose value in every column also appears in the matching column of dataframe B
>>> dfA['same_all'] = dfA['emp ID'].isin(dfB['emp ID']) & dfA['FirstName'].isin(dfB['FirstName']) & dfA['LastName'].isin(dfB['LastName'])
>>> dfA
   emp ID FirstName LastName  same_all
0       0  Prasanna        K      True
1       1      Siva        B      True
2       2      Bala        Y     False
3       3       foo      Y_F     False
>>>
>>> # Or check each column separately, here for dataframe A
>>> dfA['same_emp_ID'] = dfA['emp ID'].isin(dfB['emp ID'])
>>> dfA['same_FirstName'] = dfA['FirstName'].isin(dfB['FirstName'])
>>> dfA['same_LastName'] = dfA['LastName'].isin(dfB['LastName'])
>>>
>>> dfA
   emp ID FirstName LastName  same_all  same_emp_ID  same_FirstName  same_LastName
0       0  Prasanna        K      True         True            True           True
1       1      Siva        B      True         True            True           True
2       2      Bala        Y     False         True           False           True
3       3       foo      Y_F     False        False           False          False
>>>
>>> dfB
   emp ID FirstName LastName  same_all  same_emp_ID  same_FirstName  same_LastName
0       0  Prasanna        K      True         True            True           True
1       1      Siva        B      True         True            True           True
2       2   Karunas        Y     False         True           False           True

Taken from here.
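One caveat about the per-column `isin` checks: each one tests whether a value occurs anywhere in the other frame's column, not whether it belongs to the same emp ID. If the goal is to find employees whose details differ between the two frames, a row-aligned comparison may be closer to what is wanted. The following is a minimal sketch under that assumption, not the method from the answer above: it inner-merges on `emp ID` and compares the paired columns. The `_A`/`_B` suffixes, the `value_cols` list, and `changed_ids` are names chosen for this illustration.

```python
import pandas as pd

dfA = pd.DataFrame({'emp ID': [0, 1, 2, 3],
                    'FirstName': ['Prasanna', 'Siva', 'Bala', 'foo'],
                    'LastName': ['K', 'B', 'Y', 'Y_F']})
dfB = pd.DataFrame({'emp ID': [0, 1, 2],
                    'FirstName': ['Prasanna', 'Siva', 'Karunas'],
                    'LastName': ['K', 'B', 'Y']})

# Inner merge keeps only the emp IDs present in both frames and places
# A's and B's values for the same employee side by side.
both = dfA.merge(dfB, on='emp ID', how='inner', suffixes=('_A', '_B'))

# Compare each non-key column row by row (assumes emp ID is unique per frame).
value_cols = ['FirstName', 'LastName']
for col in value_cols:
    both['same_' + col] = both[col + '_A'] == both[col + '_B']

# A row is unchanged only if every compared column matches.
both['same_all'] = both[['same_' + col for col in value_cols]].all(axis=1)

changed_ids = both.loc[~both['same_all'], 'emp ID'].tolist()
print(both)
```

For these sample frames the only shared employee whose details differ is emp ID 2 ('Bala' in A, 'Karunas' in B); emp ID 3 does not appear in `both` at all because it exists only in A.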
