Python: Compare two DataFrames with different numbers of rows and a composite key

I have two different DataFrames that I need to compare.

The two DataFrames have different numbers of rows and no single primary key; the key is a composite of (id||ver||name||prd||loc).

df1:

id ver name   prd  loc
a  1   surya  1a   x
a  1   surya  1a   y
a  2   ram    1a   x
b  1   alex   1b   z
b  1   alex   1b   y
b  2   david  1b   z

df2:

id ver name   prd  loc
a  1   surya  1a   x
a  1   surya  1a   y
a  2   ram    1a   x
b  1   alex   1b   z

I tried the code below. It works when both DataFrames have the same number of rows, but it does not work in a case like the one above.

import numpy as np
import pandas as pd

# Source and Target are the input datasets (not shown in the question)
df1 = pd.DataFrame(Source)
df1 = df1.astype(str)  # convert every value to a string for easy comparison

df2 = pd.DataFrame(Target)
df2 = df2.astype(str)  # convert every value to a string for easy comparison

# both DataFrames have the same structure, so take the column names from df1
header_list = df1.columns.tolist()

df3 = pd.DataFrame(data=None, columns=df1.columns, index=df1.index)

# compare column by column; this is the step that fails when the row counts differ
for col in header_list:
    df3[col] = np.where(df1[col] == df2[col], 'True', 'False')

df3.to_csv('Output', index=False)

Please let me know how to compare the datasets when they have different numbers of rows.

You can try this:

~df1.isin(df2)
# df1[~df1.isin(df2)].dropna()

Let's consider a quick example:

import pandas as pd

df1 = pd.DataFrame({
    'Buyer': ['Carl', 'Carl', 'Carl'],
    'Quantity': [18, 3, 5]})

#    Buyer  Quantity
# 0  Carl        18
# 1  Carl         3
# 2  Carl         5

df2 = pd.DataFrame({
    'Buyer': ['Carl', 'Mark', 'Carl', 'Carl'],
    'Quantity': [2, 1, 18, 5]})

#    Buyer  Quantity
# 0  Carl         2
# 1  Mark         1
# 2  Carl        18
# 3  Carl         5


~df2.isin(df1)

#    Buyer  Quantity
# 0  False  True
# 1  True   True
# 2  False  True
# 3  True   True


df2[~df2.isin(df1)].dropna()

#   Buyer   Quantity
# 1 Mark    1
# 3 Carl    5
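
One caveat worth checking before relying on this: when isin is given another DataFrame, pandas matches values by index and column label, not by row content. That is why Carl / 5 is reported above even though the same pair exists in df1 (at index 2 rather than 3). Below is a minimal sketch of a row-wise check instead, assuming every column is part of the key. The names key_cols, in_df1 and missing are illustrative and not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Buyer": ["Carl", "Carl", "Carl"],
    "Quantity": [18, 3, 5]})

df2 = pd.DataFrame({
    "Buyer": ["Carl", "Mark", "Carl", "Carl"],
    "Quantity": [2, 1, 18, 5]})

# Columns that together identify a row (assumption: all of them)
key_cols = ["Buyer", "Quantity"]

# Turn each row's key columns into one tuple-like MultiIndex entry
keys1 = pd.MultiIndex.from_frame(df1[key_cols])
keys2 = pd.MultiIndex.from_frame(df2[key_cols])

# True where the whole key of a df2 row also appears somewhere in df1,
# regardless of row position
in_df1 = keys2.isin(keys1)

# Rows of df2 whose key does not occur in df1
missing = df2[~in_df1]
print(missing)
```

With this approach the row count of each DataFrame does not matter, because rows are matched on their key values rather than on their position.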

Another idea is to merge the two DataFrames on their shared column names.
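
A minimal sketch of the merge idea, using the data from the question and assuming all five columns form the composite key. The indicator=True option of merge adds a _merge column that says which side each row came from. The names key, merged, only_in_df1 and only_in_df2 are illustrative.

```python
import pandas as pd

# Composite key from the question
key = ["id", "ver", "name", "prd", "loc"]

df1 = pd.DataFrame(
    [
        ("a", 1, "surya", "1a", "x"),
        ("a", 1, "surya", "1a", "y"),
        ("a", 2, "ram", "1a", "x"),
        ("b", 1, "alex", "1b", "z"),
        ("b", 1, "alex", "1b", "y"),
        ("b", 2, "david", "1b", "z"),
    ],
    columns=key,
)

df2 = pd.DataFrame(
    [
        ("a", 1, "surya", "1a", "x"),
        ("a", 1, "surya", "1a", "y"),
        ("a", 2, "ram", "1a", "x"),
        ("b", 1, "alex", "1b", "z"),
    ],
    columns=key,
)

# Outer merge on the composite key; _merge is "both", "left_only" or "right_only"
merged = df1.merge(df2, on=key, how="outer", indicator=True)

# Rows present in only one of the two DataFrames
only_in_df1 = merged.loc[merged["_merge"] == "left_only", key]
only_in_df2 = merged.loc[merged["_merge"] == "right_only", key]

print(only_in_df1)
print(only_in_df2)
```

If the DataFrames also carried non-key columns, those could be compared on the rows where _merge is "both"; that part is left out here because the question's data contains only key columns.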

Of course, tweak the code to your needs. Hope this helps :)

