简体   繁体   中英

Comparing two dataframes with same columns and different rows

i have 2 dataframes df1, df2 and i want to compare these dataframes.

import pandas as pd

df1 = pd.DataFrame({'Column1': ['f','c','b','d','e','g','h'], 
                    'Column2': ['1','2','3','4','5','7','8']})

df2 = pd.DataFrame({'Column1': ['f','b','d','e','a','g'], 
                    'Column2': ['1','3','4','5','6','7']})

To compare dataframes, i use pandas merge. here's my code.

df = pd.merge(df1,df2, how="outer", on="Column1")

And Result:

  Column1 Column2_x Column2_y
  0       f         1         1
  1       c         2       NaN
  2       b         3         3
  3       d         4         4
  4       e         5         5
  5       g         7         7
  6       h         8       NaN
  7       a       NaN         6

But i don't want this result... the output what i want is below:

how can i get the output???

  What i want :

  Column1 Column2_x Column2_y
  0       f         1         1
  1       c         2       NaN
  2       b         3         3
  3       d         4         4
  4       e         5         5
  5       a       NaN         6
  6       g         7         7
  7       h         8       NaN

It looks like you want to preserve the row order in df1 . Try:

(df1.assign(enum=np.arange(len(df1)))
    .merge(df2, on='Column1', how='outer')
    .sort_values('enum')
    .drop(columns=['enum'])
)

Output:

  Column1 Column2_x Column2_y
0       f         1         1
1       c         2       NaN
2       b         3         3
3       d         4         4
4       e         5         5
5       g         7         7
6       h         8       NaN
7       a       NaN         6

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM