
Matching data from two csv files based on two columns and creating a new csv file with selected columns

I have two csv files.

one.csv:

1, 12.1455675, -13.1287564, 23, 9, 4.5, 4
2, 12.5934593, -13.0856385, 14, 5, 9.7, 6
3, 12.0496204, -13.8938582, 14, 6, 3.4, 9
4, 12.1456084, -12.1939589, 45, 2, 3.4, 8

two.csv:

9, 12.0496, -13.8939, .3, 55
3, 12.1456, -13.1288, 3.4, 9

What I want to do is match the two csv files based on columns one and two. I want another csv file that has the matched columns 1 and 2, but also includes the corresponding 3rd column values from two.csv and 6th column values from one.csv. Like this:

12.0496, -13.8939, 55, 3.4
12.1456, -13.1288, 9, 4.5

I am unsure how to go about this, especially since some of the values in two.csv are rounded.

Any help is greatly appreciated!

You could use pandas' I/O tools to read and write the csv files, and its database-style joining/merging capabilities to merge them:

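import pandas as pd

# Format each coordinate to 4 decimal places so values from both files compare equal
normalize = lambda x: "%.4f" % float(x)

# one.csv: read columns 1, 2 and 5; the two coordinate columns form the index
df = pd.read_csv("one.csv", index_col=[0, 1], usecols=[1, 2, 5],
                 header=None, converters=dict.fromkeys([1, 2], normalize))
# two.csv: read columns 1, 2 and 4; same two coordinate columns as the index
df2 = pd.read_csv("two.csv", index_col=[0, 1], usecols=[1, 2, 4],
                  header=None, converters=dict.fromkeys([1, 2], normalize))
# Inner join keeps only the coordinate pairs present in both files
result = df.join(df2, how='inner')
result.to_csv("output.csv", header=False)  # write as csv without a header row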

Result:

12.0496,-13.8939,3.4,55
12.1456,-13.1288,4.5,9
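
If pandas is not an option, the same idea (round both coordinates to a common precision, then look one file up in the other) can be sketched with only the standard library. This is a rough sketch, not a drop-in solution: `rounded_key` and `match_rows` are names made up for this example, the 4-decimal precision is an assumption taken from the sample data, and the inputs are inlined as strings here so the snippet runs without the actual files.

```python
import csv
import io

def rounded_key(lat, lon, places=4):
    """Format both coordinates to a fixed number of decimals so they compare equal."""
    return (f"{float(lat):.{places}f}", f"{float(lon):.{places}f}")

def match_rows(one_lines, two_lines):
    """Yield (lat, lon, col 5 of one.csv, col 4 of two.csv) for keys found in both."""
    # Index two.csv by its rounded (column 1, column 2) pair, keeping column 4.
    lookup = {}
    for row in csv.reader(two_lines, skipinitialspace=True):
        if row:  # skip blank lines
            lookup[rounded_key(row[1], row[2])] = row[4]
    # Stream one.csv and emit only the rows whose rounded key is in the lookup.
    for row in csv.reader(one_lines, skipinitialspace=True):
        if not row:
            continue
        key = rounded_key(row[1], row[2])
        if key in lookup:
            yield (*key, row[5], lookup[key])

# Sample data copied from the question (inlined instead of read from disk).
ONE = """\
1, 12.1455675, -13.1287564, 23, 9, 4.5, 4
2, 12.5934593, -13.0856385, 14, 5, 9.7, 6
3, 12.0496204, -13.8938582, 14, 6, 3.4, 9
4, 12.1456084, -12.1939589, 45, 2, 3.4, 8
"""
TWO = """\
9, 12.0496, -13.8939, .3, 55
3, 12.1456, -13.1288, 3.4, 9
"""

matches = list(match_rows(io.StringIO(ONE), io.StringIO(TWO)))
for m in matches:
    print(",".join(m))
```

With real files you would pass `open("one.csv", newline="")` and `open("two.csv", newline="")` instead of the `io.StringIO` objects, and write the rows out with `csv.writer`. Rows come out in the order they appear in one.csv.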

This is quite a common question on SO.

As for me, same answer as always: for a medium-term solution, import the data into a DB, then perform a query using a JOIN ...
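
To make the DB route a bit more concrete, here is a minimal sketch using the `sqlite3` module from the standard library with an in-memory database. Everything specific in it is an assumption made for illustration: the `load` helper, the table names `one` and `two`, the `lat_key`/`lon_key`/`value` column names, and the choice to store the coordinates as text rounded to 4 decimals so that the JOIN compares exact strings rather than floats. The sample rows from the question are inlined so the snippet runs on its own.

```python
import csv
import io
import sqlite3

def load(conn, table, lines, value_col):
    """Create `table` and insert (rounded lat, rounded lon, value) for each CSV row."""
    conn.execute(f"CREATE TABLE {table} (lat_key TEXT, lon_key TEXT, value REAL)")
    rows = (
        (f"{float(r[1]):.4f}", f"{float(r[2]):.4f}", float(r[value_col]))
        for r in csv.reader(lines, skipinitialspace=True)
        if r  # skip blank lines
    )
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?)", rows)

# Sample data copied from the question (inlined instead of read from disk).
ONE = """\
1, 12.1455675, -13.1287564, 23, 9, 4.5, 4
2, 12.5934593, -13.0856385, 14, 5, 9.7, 6
3, 12.0496204, -13.8938582, 14, 6, 3.4, 9
4, 12.1456084, -12.1939589, 45, 2, 3.4, 8
"""
TWO = """\
9, 12.0496, -13.8939, .3, 55
3, 12.1456, -13.1288, 3.4, 9
"""

conn = sqlite3.connect(":memory:")
load(conn, "one", io.StringIO(ONE), value_col=5)  # keep column 5 of one.csv
load(conn, "two", io.StringIO(TWO), value_col=4)  # keep column 4 of two.csv

# Inner join on the rounded coordinate pair; two.csv's value first, as in the question.
matched = conn.execute(
    "SELECT one.lat_key, one.lon_key, two.value, one.value "
    "FROM one JOIN two "
    "ON one.lat_key = two.lat_key AND one.lon_key = two.lon_key "
    "ORDER BY one.lat_key"
).fetchall()
conn.close()

for row in matched:
    print(row)
```

For anything bigger than a one-off you would probably point `sqlite3.connect` at a file and add an index on the two key columns, but that depends on how much data there is.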


Try a search: https://stackoverflow.com/search?q=combining+csv+python
