Merge two CSV files based on data from the first column
I have two CSV files like the ones below that I'd like to merge, using the first column, ID_, as the unique identifier, and append the AMT column as a new column in the final file.
CSV1
ID_ CUSTOMER_ID_ EMAIL_ADDRESS_
1090 1 example1@example.com
1106 2 example2@example.com
1145 3 example3@example.com
1206 4 example4@example.com
1247 5 example5@example.com
1254 6 example6@example.com
1260 7 example7@example.com
1361 8 example8@example.com
1376 9 example9@example.com
CSV2
ID_ AMT
1090 5
1106 5
1145 5
1206 5
1247 5
1254 65
1260 5
1361 10
1376 5
Here's what I'm looking for in the final file:
ID_ CUSTOMER_ID_ EMAIL_ADDRESS_ AMT
1090 1 example1@example.com 5
1106 2 example2@example.com 5
1145 3 example3@example.com 5
1206 4 example4@example.com 5
1247 5 example5@example.com 5
1254 6 example6@example.com 65
1260 7 example7@example.com 5
1361 8 example8@example.com 10
1376 9 example9@example.com 5
I've tried modifying the command below as much as possible, but I'm not able to get what I'm looking for. I'm really stuck on this and not sure what else I can do. I'd really appreciate any and all help!
join -t, File1.csv File2.csv
The data shown in this example contains tabs, but my actual files are CSVs, as mentioned, and use commas as the separator.
This can be done easily using the pandas library. Here is my code to do this:
'''
This program reads two csv files and merges them based on a common key column.
'''
# import the pandas library
# you can install using the following command: pip install pandas
import pandas as pd
# Read the files into two dataframes.
df1 = pd.read_csv('CSV1.csv')
df2 = pd.read_csv('CSV2.csv')
# Merge the two dataframes, using the ID_ column as the key
df3 = pd.merge(df1, df2, on = 'ID_')
df3.set_index('ID_', inplace = True)
# Write it to a new CSV file
df3.to_csv('CSV3.csv')
You can find a short tutorial on pandas here: https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html
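If installing pandas isn't an option, roughly the same inner join can be sketched with Python's standard csv module. This is only a minimal sketch: the function name merge_csv_on_first_column and its parameters are placeholders of my own, and it assumes both files are comma-separated, have a header row, and that each ID_ appears at most once in the second file.

```python
import csv


def merge_csv_on_first_column(left_path, right_path, out_path, key="ID_"):
    """Append the non-key columns of right_path to matching rows of left_path.

    Hypothetical helper for illustration; rows of left_path whose key is
    missing from right_path are skipped (inner-join behaviour).
    """
    # Index the second file by its key column for fast lookups.
    with open(right_path, newline="") as f:
        reader = csv.DictReader(f)
        extra_fields = [name for name in reader.fieldnames if name != key]
        lookup = {row[key]: row for row in reader}

    with open(left_path, newline="") as f_in, \
            open(out_path, "w", newline="") as f_out:
        reader = csv.DictReader(f_in)
        writer = csv.DictWriter(
            f_out, fieldnames=list(reader.fieldnames) + extra_fields
        )
        writer.writeheader()
        for row in reader:
            match = lookup.get(row[key])
            if match is None:
                continue  # no matching ID in the second file
            for name in extra_fields:
                row[name] = match[name]
            writer.writerow(row)
```

Calling merge_csv_on_first_column('CSV1.csv', 'CSV2.csv', 'CSV3.csv') would then write the merged rows in the order they appear in the first file. Unlike the join command, this does not need either file to be sorted.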