
Combining CSVs with Different Columns using Pandas [with key column]

I'm trying to combine two CSV files in Python. Each file has its own unique columns, but both files share a common key column.

I've looked through Stack Overflow, Google, and the Pandas documentation but didn't find exactly what I was looking for. The examples on the Pandas documentation pages for merge and concat differ from what I'm trying to achieve, so I'm not sure whether what I'm asking is possible with Pandas.

I've read selected columns from both CSV files into separate dataframes. What I would like to do now is combine the two dataframes into a single dataframe based on the key column.

Example

CSV 1:
Key   Make   Model
501   Audi   A3
502   Audi   A4
503   Audi   A5

CSV 2:
Key   Engine
501   2.0T
502   2.0T
503   2.0T

Combined Expected Result:
Key   Make   Model   Engine
501   Audi   A3      2.0T
502   Audi   A4      2.0T
503   Audi   A5      2.0T

You need to read your CSVs into two separate dataframes and then join them on the 'Key' column.

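import pandas as pd
# Read each CSV file into its own dataframe
df1 = pd.read_csv('csv1.csv')
df2 = pd.read_csv('csv2.csv')
# Both frames use the same key name, so on='Key' is enough (inner join by default)
df_final = df1.merge(df2, on='Key')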
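Since the question mentions reading only selected columns, here is a minimal self-contained sketch of that variant. It assumes the files are comma-separated, uses in-memory strings as stand-ins for csv1.csv and csv2.csv, and adds a made-up Notes column purely to show usecols dropping it. The validate argument is optional. It asks pandas to raise an error if 'Key' turns out not to be unique in either frame.

```python
import io
import pandas as pd

# Hypothetical stand-ins for csv1.csv and csv2.csv (assumed comma-separated).
# The 'Notes' column is invented here only to demonstrate usecols.
csv1 = io.StringIO(
    "Key,Make,Model,Notes\n"
    "501,Audi,A3,x\n"
    "502,Audi,A4,y\n"
    "503,Audi,A5,z\n"
)
csv2 = io.StringIO(
    "Key,Engine\n"
    "501,2.0T\n"
    "502,2.0T\n"
    "503,2.0T\n"
)

# Read only the columns of interest from each file
df1 = pd.read_csv(csv1, usecols=["Key", "Make", "Model"])
df2 = pd.read_csv(csv2, usecols=["Key", "Engine"])

# Join on the shared key; validate raises MergeError if a key is duplicated
df_final = df1.merge(df2, on="Key", validate="one_to_one")
print(df_final)
```

With real files, the io.StringIO objects would simply be replaced by the file paths.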

Kacper Sobociński's answer is correct; you can use pandas merge.

import pandas as pd

data1 = {'Key': [501, 502, 503],
         'Make': ['Audi', 'Audi', 'Audi'],
         'Model': ['A3', 'A4', 'A5']}

data2 = {'Key': [501, 502, 503],
         'Engine': ['2.0T', '2.0T', '2.0T']}

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

# Keep only the rows whose 'Key' appears in both dataframes
df = pd.merge(df1, df2, how='inner', on='Key')

print(df)

   Key  Make Model Engine
0  501  Audi    A3   2.0T
1  502  Audi    A4   2.0T
2  503  Audi    A5   2.0T
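The inner merge works cleanly here because both frames hold exactly the same keys. As a rough sketch of what the how argument changes when they don't, the example below uses made-up data in which the second frame lacks key 501 and has an extra key 504 (the '3.0T' value is invented for illustration). Passing indicator=True adds a _merge column that records where each row came from.

```python
import pandas as pd

df1 = pd.DataFrame({"Key": [501, 502, 503],
                    "Make": ["Audi", "Audi", "Audi"],
                    "Model": ["A3", "A4", "A5"]})

# Hypothetical second frame with keys that only partly overlap df1
df2 = pd.DataFrame({"Key": [502, 503, 504],
                    "Engine": ["2.0T", "2.0T", "3.0T"]})

# Inner join: only keys present in both frames survive
inner = pd.merge(df1, df2, how="inner", on="Key")

# Outer join: all keys are kept, and missing values become NaN;
# indicator=True adds a '_merge' column showing each row's origin
outer = pd.merge(df1, df2, how="outer", on="Key", indicator=True)

print(inner)
print(outer)
```

If every row of the first CSV must be kept regardless of a match, how='left' would be the usual choice.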
