How do I merge two large CSV files?

Question

I have two large .csv files that I would like to join.

file1.csv has the following structure:

productcode; *many useless columns* ; startdate; enddate; *some other useless columns*

file2.csv has the following structure:

productcode; *many useless columns different from file1* ; page; startdate; enddate; *some other useless columns*

I would like to join the two files into one file (say, out.csv) with the same structure as file1.csv but with the "page" column from file2.csv, i.e.:

productcode; *useless columns* ; page; startdate; enddate; *useless columns*

The join conditions are the same productcode and overlapping date ranges, i.e.:

file1.productcode == file2.productcode

and

!(file1.enddate < file2.startdate or file2.enddate < file1.startdate)

However, I have no idea how to do that. One possibility would be to import the two CSVs into MySQL, process them there, and then export the result to a final CSV file. However, that takes time and consumes a lot of resources.

I'm open to any suggestions.

Answer 1

Load both files with pandas and use the .join() function to combine them on the columns you need.
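A minimal sketch of what this answer suggests, assuming pandas is installed and the columns are named exactly `productcode`, `page`, `startdate` and `enddate`. It uses `DataFrame.merge` rather than `.join()`, because `merge` matches on ordinary columns while `.join()` matches on the index by default. The sample rows, the `junk1`/`junk2` columns and the helper name `join_on_overlap` are made up for illustration and are not from the question.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for file1.csv and file2.csv.
FILE1 = """productcode;junk1;startdate;enddate
A;x;2017-01-01;2017-01-31
A;y;2017-03-01;2017-03-31
B;z;2017-01-01;2017-01-10
"""
FILE2 = """productcode;junk2;page;startdate;enddate
A;q;10;2017-01-15;2017-02-15
A;r;11;2017-05-01;2017-05-31
B;s;20;2017-01-10;2017-01-20
"""

def join_on_overlap(df1, df2):
    """Attach file2's 'page' to file1 rows with the same productcode
    and an overlapping [startdate, enddate] range."""
    # Keep only the needed file2 columns; rename its dates to avoid clashes.
    right = df2[["productcode", "page", "startdate", "enddate"]].rename(
        columns={"startdate": "startdate2", "enddate": "enddate2"}
    )
    # Equi-join on productcode: every file1 row is paired with every
    # file2 row that has the same productcode.
    merged = df1.merge(right, on="productcode", how="inner")
    # Overlap test from the question:
    # !(file1.enddate < file2.startdate or file2.enddate < file1.startdate)
    overlap = ~(
        (merged["enddate"] < merged["startdate2"])
        | (merged["enddate2"] < merged["startdate"])
    )
    # file1's column order, with 'page' inserted just before 'startdate'.
    cols = list(df1.columns)
    cols.insert(cols.index("startdate"), "page")
    return merged.loc[overlap, cols].reset_index(drop=True)

df1 = pd.read_csv(io.StringIO(FILE1), sep=";", parse_dates=["startdate", "enddate"])
df2 = pd.read_csv(io.StringIO(FILE2), sep=";", parse_dates=["startdate", "enddate"])
out = join_on_overlap(df1, df2)
print(out)
```

With real files, pass the file paths to `pd.read_csv` instead of `io.StringIO(...)` and write the result with `out.to_csv("out.csv", sep=";", index=False)`. Note that the merge builds every same-productcode pair before the date filter runs, so memory use can grow quickly when a product has many rows in both files. Reading file1.csv in pieces with the `chunksize` argument of `pd.read_csv` and merging each piece separately may help in that case.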

Answer 1 · 0 · Accepted · 2017-05-12 16:04:26
