I have two CSV files which have the following fields:
FILE 1 :
objectID,objectName,objecttype
FILE 2 :
objectID,objectprice,objecttotalprice
The data in these two files is separated by commas. What I want is to join these two files on objectID. The output should contain the joined data plus the rows of file 1 that did not match anything in file 2. I tried this code, but it does not give the correct output:
import pandas as pd
a = pd.read_csv("file1.csv", names = ["objectID", "objectName", "objecttype"],header = 0).astype(basestring)
b = pd.read_csv("file2.csv").astype(basestring)
merged= a.merge(b, on='objectID',how='outer')
merged.to_csv("output.csv", index=False)
When I run this, the output contains the data of file1 (with empty values for the fields of file2) followed by the data of file2 (with empty values for the fields of file1).
What am I doing wrong here, and how can I do the join correctly?
NOTE: In file1 the field names are a bit different, which is why I rename them when reading file1.csv above.
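I think you are looking for a left join. Try:
merged = a.merge(b, on='objectID', how='left')
It works like a SQL LEFT JOIN: every row of a is kept, and the columns coming from b are left empty (NaN) where no matching objectID exists (see the pandas merge documentation).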
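A minimal sketch of the left join, using small in-memory CSVs in place of file1.csv and file2.csv. The sample rows and the original file1 header names (id, name, type) are made up for illustration. Reading with dtype=str and stripping whitespace from the key is an assumption about one possible cause of the non-matching rows described in the question, not a confirmed diagnosis. The indicator=True argument adds a _merge column that shows which rows found a match.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for file1.csv and file2.csv.
file1 = io.StringIO(
    "id,name,type\n"
    "1,chair,furniture\n"
    "2,lamp,lighting\n"
    "3,desk,furniture\n"
)
file2 = io.StringIO(
    "objectID,objectprice,objecttotalprice\n"
    "1,10,20\n"
    "3,50,50\n"
    "4,7,14\n"
)

# header=0 discards the original header row of file1; names= supplies the new one.
# dtype=str reads every column as text so the join key has the same type in both frames.
a = pd.read_csv(file1, names=["objectID", "objectName", "objecttype"], header=0, dtype=str)
b = pd.read_csv(file2, dtype=str)

# Assumption: stray whitespace around the key could prevent matches, so strip it.
a["objectID"] = a["objectID"].str.strip()
b["objectID"] = b["objectID"].str.strip()

# Left join: keep every row of a, attach b's columns where objectID matches.
merged = a.merge(b, on="objectID", how="left", indicator=True)
print(merged)
```

Rows whose _merge value is left_only are the file1 rows with no counterpart in file2. Drop the _merge column (or omit indicator=True) before calling to_csv if it is not wanted in the output.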