Find matching and non-matching records in pandas

Question

I have two CSV files with the data shown below. How can I perform a basic match and produce a result like the output shown? I'm matching on the website field; that is the key I'm using.

I tried the approaches in "Efficiently find matching rows (based on content) in a pandas DataFrame" and https://macxima.medium.com/python-retrieve-matching-rows-from-two-dataframes-d22ad9e71879

but I'm not getting my desired output. Any assistance would be helpful.

file1.csv
| id | web_1  |
|----|------|
| 1  | google.com |
| 2  | microsoft.in |
| 3  | yahoo.uk |
| 4  | adobe.us |


file2.csv
| id | web_2 |
|----|-----|
| 2  | microsoft.in |
| 3  | yahoo.uk |
| 4  | adobe.us |


Desired output
| id | web_1  | web_2  |
|----|------|--------|
| 1  | google.com | |
| 2  | microsoft.in | microsoft.in |
| 3  | yahoo.uk | yahoo.uk |
| 4  | adobe.us | adobe.us |

Answer 1

Based on your comment, if you want to merge the DataFrames so that the result includes only the rows where the merge keys match, you can do an inner join.

pandas.DataFrame.merge uses 'inner' as the default type of merge. When no `on` argument is given, it joins on the columns the two DataFrames have in common, which here is `id`.

import pandas as pd

df1 = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "web_1": ["google.com", "microsoft.in", "yahoo.uk", "adobe.us"],
    }
)
df2 = pd.DataFrame(
    {
        "id": [2, 3, 4],
        "web_2": ["microsoft.in", "yahoo.uk", "adobe.us"],
    }
)

>>> pd.merge(df1, df2)
   id         web_1         web_2
0   2  microsoft.in  microsoft.in
1   3      yahoo.uk      yahoo.uk
2   4      adobe.us      adobe.us
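
The merge above joins on `id`, while the question describes the website as the matching key. A minimal sketch of joining on the website columns instead, assuming the same sample data; the `suffixes` values and the variable name `matched` are my own choices for illustration:

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "web_1": ["google.com", "microsoft.in", "yahoo.uk", "adobe.us"],
    }
)
df2 = pd.DataFrame(
    {
        "id": [2, 3, 4],
        "web_2": ["microsoft.in", "yahoo.uk", "adobe.us"],
    }
)

# Join on the website columns rather than on the shared "id" column.
# Because "id" exists in both frames but is not a key here, pandas
# disambiguates it with the given suffixes (id_1 from df1, id_2 from df2).
matched = pd.merge(
    df1,
    df2,
    left_on="web_1",
    right_on="web_2",
    how="inner",
    suffixes=("_1", "_2"),
)

print(matched)
```

With this sample data the two approaches select the same three rows, because the ids and websites happen to line up. They would differ if the same website appeared under different ids in the two files.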

If you don't want to keep both web columns, you can drop one of them:

>>> pd.merge(df1, df2).drop(columns='web_2')
   id         web_1
0   2  microsoft.in
1   3      yahoo.uk
2   4      adobe.us

Drop and rename:

>>> pd.merge(df1, df2).drop(columns='web_2').rename(columns={'web_1': 'web'})
   id           web
0   2  microsoft.in
1   3      yahoo.uk
2   4      adobe.us
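
The inner join drops the `google.com` row, whereas the table in the question keeps it with an empty `web_2`. If that table is what you want, a left join is one way to get it. The following is a sketch under that assumption, using the `how` and `indicator` parameters of `pd.merge`; the variable names `result`, `matching`, `non_matching` and `output` are only illustrative:

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "web_1": ["google.com", "microsoft.in", "yahoo.uk", "adobe.us"],
    }
)
df2 = pd.DataFrame(
    {
        "id": [2, 3, 4],
        "web_2": ["microsoft.in", "yahoo.uk", "adobe.us"],
    }
)

# how="left" keeps every row of df1; indicator=True adds a "_merge"
# column that says whether each row was found in both frames.
result = pd.merge(df1, df2, on="id", how="left", indicator=True)

# Split into matching and non-matching records.
matching = result[result["_merge"] == "both"]
non_matching = result[result["_merge"] == "left_only"]

# Replace the missing web_2 value with an empty string to mirror the
# blank cell in the desired output, and drop the helper column.
output = result.drop(columns="_merge").fillna({"web_2": ""})

print(output)
```

Using `how="outer"` instead would also keep rows that appear only in `df2`, which `_merge` marks as `right_only`.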
