
How can I merge multiple different CSV files with foreign keys?

Good evening. I've been working with the Instacart dataset as part of my online classes, using Jupyter Notebook (Python). One of the requirements is to merge all of the files (which mostly have different columns and share one or two foreign keys) into one big CSV, as in this example:

https://github.com/gabrielhpr/InstacartClustering/blob/master/InstacartClustering.ipynb

However, I don't know how to accomplish that. Each file comes with a foreign key, so I guess that's the way to go, but how do you match those foreign keys to the correct rows and combine all the CSV files?

Yes, you can do it very easily.

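  1. Set the foreign key column as the index of each dataframe. set_index returns a new dataframe, so assign the result:

    df = df.set_index(foreign_key)

  2. Use pd.concat([df1, df2], axis=1) to merge the two dataframes. concat aligns rows on the index, so this works when each key value appears only once per dataframe; when a key repeats (many rows pointing at one row of another file), pd.merge on the key column is the usual tool.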
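The two steps above can be sketched as follows. This is a minimal illustration, not the asker's actual data: the two small dataframes and the column names (`product_id`, `product_name`, `aisle_id`) are made-up stand-ins for files that would normally be loaded with `pd.read_csv`.

```python
import pandas as pd

# Hypothetical stand-ins for two CSV files that share the key "product_id".
# In practice these would come from pd.read_csv(...).
df1 = pd.DataFrame({"product_id": [1, 2, 3],
                    "product_name": ["apple", "bread", "milk"]})
df2 = pd.DataFrame({"product_id": [3, 1, 2],
                    "aisle_id": [30, 10, 20]})

# Step 1: make the shared key the index of each dataframe.
# set_index returns a new dataframe, so the result is assigned back.
df1 = df1.set_index("product_id")
df2 = df2.set_index("product_id")

# Step 2: concatenate column-wise; rows are matched by index value,
# not by their position in the file.
combined = pd.concat([df1, df2], axis=1)
print(combined)
```

Note that the rows of `df2` are in a different order than those of `df1`; the index alignment is what pairs each product with its correct `aisle_id`.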

Using these two steps, you will be able to merge two CSV files that share a foreign key.
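For files where the same key appears on many rows (for example, many order lines referring to one product), a join with `pd.merge` is a common alternative to the index-based approach. The sketch below is only an assumption about how such files might look: the three tiny dataframes and their column names are hypothetical, and the `to_csv` call is left commented out.

```python
import pandas as pd

# Hypothetical miniature tables; real ones would be loaded with pd.read_csv(...).
orders = pd.DataFrame({"order_id": [100, 100, 101],
                       "product_id": [1, 2, 1]})
products = pd.DataFrame({"product_id": [1, 2],
                         "product_name": ["apple", "bread"],
                         "aisle_id": [10, 20]})
aisles = pd.DataFrame({"aisle_id": [10, 20],
                       "aisle": ["fruit", "bakery"]})

# Chain left joins along the foreign keys. validate="many_to_one" makes
# pandas raise an error if a key is unexpectedly duplicated on the right side.
merged = (
    orders
    .merge(products, on="product_id", how="left", validate="many_to_one")
    .merge(aisles, on="aisle_id", how="left", validate="many_to_one")
)
print(merged)

# To write the single combined CSV:
# merged.to_csv("merged.csv", index=False)
```

A left join keeps every row of the leftmost table, so each order line survives even if its key has no match in a lookup table (the missing columns are filled with NaN).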


 