Take a join of two CSV files over a common column in Python

Question

I have two CSV files with the following fields:

FILE 1 :

objectID,objectName,objecttype

FILE 2 :

objectID,objectprice,objecttotalprice

The data in both files is comma-separated. I want to join the two files on objectID. The output should contain the joined rows plus the rows of file 1 that have no match in file 2. I tried this code, but it does not give the correct output:

import pandas as pd

a = pd.read_csv("file1.csv", names=["objectID", "objectName", "objecttype"], header=0).astype(basestring)
b = pd.read_csv("file2.csv").astype(basestring)
merged = a.merge(b, on='objectID', how='outer')
merged.to_csv("output.csv", index=False)

When I run this, the output contains the data of file1 (with empty values for the fields of file2) followed by the data of file2 (with empty values for the fields of file1).

What am I doing wrong here, and how can I do the join correctly?

NOTE: In file1 the field names are a bit different, hence I rename them when reading file1.csv above.

Answer 1

I think you are looking for a left join. Try:

merged= a.merge(b, on='objectID', how='left')

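It works like a SQL left join: every row of a is kept, and the columns coming from b are left empty where no row of b has a matching objectID (see the documentation).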
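To make the difference concrete, here is a minimal sketch of the left join on made-up data. The sample rows, the original file1 header names (id, name, type) and the in-memory io.StringIO "files" are assumptions for illustration, not the asker's real data. It also reads every column as a string and strips whitespace from the key, since keys that differ only in dtype or padding are one possible reason why an outer join shows no matching rows at all. That is a guess about the cause, not something the question confirms.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for file1.csv and file2.csv.
file1 = io.StringIO(
    "id,name,type\n"
    "1,chair,furniture\n"
    "2,lamp,lighting\n"
    "3,desk,furniture\n"
)
file2 = io.StringIO(
    "objectID,objectprice,objecttotalprice\n"
    "1,10,20\n"
    "3,50,50\n"
    "4,7,70\n"
)

# Read every column as a string so the key has the same type in both frames;
# names=... with header=0 replaces file1's own header row.
a = pd.read_csv(file1, names=["objectID", "objectName", "objecttype"],
                header=0, dtype=str)
b = pd.read_csv(file2, dtype=str)

# Strip stray whitespace from the key (assumed possible cause of non-matches).
a["objectID"] = a["objectID"].str.strip()
b["objectID"] = b["objectID"].str.strip()

# how="left" keeps all rows of a; indicator=True adds a _merge column that
# says whether each row was found in both frames or only in the left one.
merged = a.merge(b, on="objectID", how="left", indicator=True)
print(merged)
```

In this sketch objectID 2 has no price row, so its price columns are empty and _merge reads left_only, while objectID 4 (present only in the second file) does not appear. The _merge column is only there for diagnosis; drop indicator=True before writing the result with to_csv.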
