Python: How to write the difference of two CSV files, with one having a subset of columns, to a new CSV file
Parent CSV file:
Name ID Address Age Phone Email
John 123 New York 32 24... email@something.some
George 231 London 24 21... email2@something.some
Adam 321 Berlin 12 71... email3@something.some
... .... ... ... ... ...
Second CSV file:
Name ID Age Email
Adam 321 Berlin email3@something.some
George 231 London email2@something.some
... .... ... ...
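I have been trying to create a new CSV file that contains the rows of the first CSV file that do not appear in the second CSV file, together with the additional column information (for example, I also want to include the address for the corresponding ID), so that the CSV file looks like this: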
Name ID Age Address Email
John 123 New York 32 24... email@something.some
... .... ... ... ...
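I tried the approach from "Compare 2 separate csv files and write difference to a new csv file - Python 2.7", but I could not include the Address column (whose index is known) in the new CSV file while also extracting and writing the rest of the differences. Perhaps there is a faster option that goes by ID and writes to the new file straight away.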
If I understand correctly and you want these columns:
Name ID Age Address Email
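You can do this with a dictionary: build a dictionary from the first file, with the ID as the key and the whole row as the value. Then you can pick out the IDs you need and rebuild each row in whatever column order you want.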
import csv

# Build a dictionary from the first file: ID (column 1) -> full row
with open("file1.csv") as f1:
    file1RowDict = {row[1]: row for row in csv.reader(f1)}

# Collect all the IDs in the second file (a set makes lookups fast)
with open("file2.csv") as f2:
    file2Ids = {row[1] for row in csv.reader(f2)}

# Find the IDs that appear only in the first file.
# The header key 'ID' is present in both files, so it is filtered out here.
ids = [row_id for row_id in file1RowDict if row_id not in file2Ids]

newFileRows = []

# Grab the header row and reorder it as Name, ID, Age, Address, Email
headerRow = file1RowDict['ID']
headerRow = [headerRow[0], headerRow[1], headerRow[3], headerRow[2], headerRow[5]]
newFileRows.append(headerRow)

# Grab each data row and reorder it the same way
for row_id in ids:
    row = file1RowDict[row_id]
    newRow = [row[0], row[1], row[3], row[2], row[5]]
    newFileRows.append(newRow)

print(str(newFileRows))
Output:
[
['Name', 'ID', 'Age', 'Address', 'Email'],
['John', '123', '32', 'New York', 'email@something.some']
]
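The code above only prints the rows, whereas the question asks for a new CSV file. One possible way to add the writing step is sketched below. It assumes Python 3, comma-separated files with a header row, and the column names shown in the question; the function name `write_difference` and the output filename are hypothetical and not part of the original answer. It uses `csv.DictReader` and `csv.DictWriter` so that columns are selected by name rather than by index.

```python
import csv


def write_difference(parent_path, subset_path, out_path,
                     key="ID",
                     columns=("Name", "ID", "Age", "Address", "Email")):
    """Write rows of parent_path whose key is absent from subset_path.

    Hypothetical helper: assumes both files have a header row
    that contains the `key` column.
    """
    # Collect every ID that appears in the second (subset) file.
    with open(subset_path, newline="") as f:
        seen_ids = {row[key] for row in csv.DictReader(f)}

    with open(parent_path, newline="") as src, \
            open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        # extrasaction="ignore" drops parent columns not listed in `columns`
        # (for example Phone).
        writer = csv.DictWriter(dst, fieldnames=list(columns),
                                extrasaction="ignore")
        writer.writeheader()
        kept = 0
        for row in reader:
            if row[key] not in seen_ids:
                writer.writerow(row)
                kept += 1
    # Number of data rows written, excluding the header.
    return kept
```

It could be called as `write_difference("file1.csv", "file2.csv", "difference.csv")`. Because rows are streamed straight from the parent file to the output, the parent file never has to be held in memory in full; only the set of IDs from the second file is kept.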