
Want to compare two data frames using a column value

I have two similar data frames. I want to compare them using the values in column 1.

emp ID  FirstName Lastname
1       Prasanna  K
2       Siva      B

emp ID  FirstName Lastname
1       Prasana   K
2       Siva      B
3       Karunas   Y

I want to compare the two DFs on Emp ID and identify the unique, non-unique, and new items.

Thanks.
-Prasanna.K

You can use something similar to what is given below:

>>> import pandas as pd
>>>
>>> dictA = {'emp ID': [0, 1], 'FirstName': ['Prasanna', 'Siva'], 'LastName': ['K', 'B']}
>>> dictB = {'emp ID': [0, 1, 2], 'FirstName': ['Prasanna', 'Siva', 'Karunas'], 'LastName': ['K', 'B', 'Y']}
>>>
>>> dfA = pd.DataFrame(dictA)
>>> dfB = pd.DataFrame(dictB)
>>> dfA
   emp ID FirstName LastName
0       0  Prasanna        K
1       1      Siva        B
>>> dfB
   emp ID FirstName LastName
0       0  Prasanna        K
1       1      Siva        B
2       2   Karunas        Y

>>> # Check which emp IDs in dataframe B are also present in dataframe A (False marks IDs found only in B)
>>> dfB['present'] = dfB['emp ID'].isin(dfA['emp ID'])
>>> dfB
   emp ID FirstName LastName  present
0       0  Prasanna        K     True
1       1      Siva        B     True
2       2   Karunas        Y    False

>>> # Check which emp IDs in dataframe A are also present in dataframe B (False marks IDs found only in A)
>>> dfA['present'] = dfA['emp ID'].isin(dfB['emp ID'])
>>> dfA
   emp ID FirstName LastName  present
0       0  Prasanna        K     True
1       1      Siva        B     True
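
As a complementary sketch (not part of the original answer), the same key comparison can be done in one step with an outer merge and `indicator=True`, which labels each emp ID as present in the first frame only, the second frame only, or both. The names `df_a`, `df_b`, `only_in_a`, `only_in_b`, and `in_both` are made up for illustration, and the data mirrors the tables in the question.

```python
import pandas as pd

# Sample data mirroring the question's tables (emp ID 3 exists only in the second frame).
df_a = pd.DataFrame({"emp ID": [1, 2],
                     "FirstName": ["Prasanna", "Siva"],
                     "LastName": ["K", "B"]})
df_b = pd.DataFrame({"emp ID": [1, 2, 3],
                     "FirstName": ["Prasana", "Siva", "Karunas"],
                     "LastName": ["K", "B", "Y"]})

# Outer merge on the key column only; indicator=True adds a "_merge" column
# whose values are "left_only", "right_only", or "both".
merged = df_a[["emp ID"]].merge(df_b[["emp ID"]], on="emp ID",
                                how="outer", indicator=True)

only_in_a = merged.loc[merged["_merge"] == "left_only", "emp ID"].tolist()
only_in_b = merged.loc[merged["_merge"] == "right_only", "emp ID"].tolist()
in_both = merged.loc[merged["_merge"] == "both", "emp ID"].tolist()

print(only_in_a, only_in_b, in_both)
```

Here emp ID 3 should end up in `only_in_b` (a new item), while 1 and 2 land in `in_both`.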

Edit based on the OP's comments

>>> import pandas as pd
>>>
>>> dictA = {'emp ID': [0, 1, 2, 3], 'FirstName': ['Prasanna', 'Siva', 'Bala', 'foo'], 'LastName': ['K', 'B', 'Y', 'Y_F']}
>>> dictB = {'emp ID': [0, 1, 2], 'FirstName': ['Prasanna', 'Siva', 'Karunas'], 'LastName': ['K', 'B', 'Y']}
>>>
>>> dfA = pd.DataFrame(dictA)
>>> dfB = pd.DataFrame(dictB)
>>> dfA
   emp ID FirstName LastName
0       0  Prasanna        K
1       1      Siva        B
2       2      Bala        Y
3       3       foo      Y_F
>>> dfB
   emp ID FirstName LastName
0       0  Prasanna        K
1       1      Siva        B
2       2   Karunas        Y
>>> # Flag rows of dataframe B whose value in every column also appears in dataframe A.
>>> # Note: isin tests each column independently, so this is not a strict row-by-row match.
>>> dfB['same_all'] = dfB['emp ID'].isin(dfA['emp ID']) & dfB['FirstName'].isin(dfA['FirstName']) & dfB['LastName'].isin(dfA['LastName'])
>>> dfB
   emp ID FirstName LastName  same_all
0       0  Prasanna        K      True
1       1      Siva        B      True
2       2   Karunas        Y     False
>>> 
>>> # Or, to check each column separately for dataframe B:
>>> dfB['same_emp_ID'] = dfB['emp ID'].isin(dfA['emp ID'])
>>> dfB['same_FirstName'] = dfB['FirstName'].isin(dfA['FirstName'])
>>> dfB['same_LastName'] = dfB['LastName'].isin(dfA['LastName'])
>>> # Flag rows of dataframe A whose value in every column also appears in dataframe B.
>>> # Note: isin tests each column independently, so this is not a strict row-by-row match.
>>> dfA['same_all'] = dfA['emp ID'].isin(dfB['emp ID']) & dfA['FirstName'].isin(dfB['FirstName']) & dfA['LastName'].isin(dfB['LastName'])
>>> dfA
   emp ID FirstName LastName  same_all
0       0  Prasanna        K      True
1       1      Siva        B      True
2       2      Bala        Y     False
3       3       foo      Y_F     False
>>> # Or, to check each column separately for dataframe A:
>>> dfA['same_emp_ID'] = dfA['emp ID'].isin(dfB['emp ID'])
>>> dfA['same_FirstName'] = dfA['FirstName'].isin(dfB['FirstName'])
>>> dfA['same_LastName'] = dfA['LastName'].isin(dfB['LastName'])
>>> dfA
   emp ID FirstName LastName  same_all  same_emp_ID  same_FirstName  same_LastName
0       0  Prasanna        K      True         True            True           True
1       1      Siva        B      True         True            True           True
2       2      Bala        Y     False         True           False           True
3       3       foo      Y_F     False        False           False          False
>>> 
>>> dfB
   emp ID FirstName LastName  same_all  same_emp_ID  same_FirstName  same_LastName
0       0  Prasanna        K      True         True            True           True
1       1      Siva        B      True         True            True           True
2       2   Karunas        Y     False         True           False           True
>>> 
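
Because `isin` checks each column on its own, a row can be flagged as matching even when its values come from different rows of the other frame. If a strict per-employee comparison is needed, one possible approach (a sketch, not the original answer's method) is to index both frames by emp ID and compare the shared rows element-wise. The names `a`, `b`, `shared`, `new_ids`, `changed_ids`, and `unchanged_ids` are hypothetical, and the data mirrors the tables in the question.

```python
import pandas as pd

# Sample data mirroring the question's tables
# ("Prasanna" vs "Prasana" differs for emp ID 1).
df_a = pd.DataFrame({"emp ID": [1, 2],
                     "FirstName": ["Prasanna", "Siva"],
                     "LastName": ["K", "B"]})
df_b = pd.DataFrame({"emp ID": [1, 2, 3],
                     "FirstName": ["Prasana", "Siva", "Karunas"],
                     "LastName": ["K", "B", "Y"]})

# Index by the key so rows line up by emp ID rather than by position.
a = df_a.set_index("emp ID")
b = df_b.set_index("emp ID")

shared = a.index.intersection(b.index)   # IDs present in both frames
new_ids = b.index.difference(a.index)    # IDs present only in the second frame

# Compare the shared rows cell by cell; any(axis=1) marks rows
# where at least one column differs.
changed_mask = (a.loc[shared] != b.loc[shared]).any(axis=1)
changed_ids = changed_mask[changed_mask].index.tolist()
unchanged_ids = changed_mask[~changed_mask].index.tolist()

print(changed_ids, unchanged_ids, new_ids.tolist())
```

With this data, emp ID 1 should be reported as changed, 2 as unchanged, and 3 as new. This sketch assumes emp ID is unique within each frame and that both frames share the same non-key columns.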

Taken from here
