Python: Read CSV file and write to another text file
I have this .csv file:
id,first_name,last_name,email,date,opt-in,unique_code
1,Jimmy,Reyes,jreyes0@macromedia.com,12/29/2016,FALSE,ER45DH
2,Doris,Wood,dwood1@1und1.de,04/22/2016,,MU34T3
3,Steven,Miller,smiller2@go.com,07/31/2016,FALSE,G34FGH
4,Earl,Parker,eparker3@ucoz.com,01-08-17,FALSE,ASY67J
5,Barbara,Cruz,bcruz4@zdnet.com,12/30/2016,FALSE,NHG67P
If the opt-in value is empty, it should print "0". The last value in each CSV row should be printed first, followed by all the name/value pairs in a specific format, as shown in the expected output below.
My expected output:
ER45DH<tab>"id"="1","first_name"="Jimmy","last_name"="Reyes","email"="jreyes0@macromedia.com","date"="12/29/2016","opt-in"="FALSE"
MU34T3<tab>"id"="2","first_name"="Doris","last_name"="Wood","email"="dwood1@1und1.de","date"="04/22/2016","opt-in"="0"
.......
My code so far:
import csv
with open('newfilename.csv', 'w') as f2:
    with open('mycsvfile.csv', mode='r') as infile:
        reader = csv.reader(infile)
        for i,rows in enumerate(reader):
            if i == 0:
                header = rows
            else:
                if rows[5] == '':
                    rows[5] = 0;
                pat = rows[0]+'\t'+'''"%s"="%%s",'''*(len(header)-2)+'''"%s"="%%s"\n'''
                print pat
                f2.write(pat % tuple(header[1:]) % tuple(rows[1:]))
f2.close()
This code produces this output:
1 "first_name"="Jimmy","last_name"="Reyes","email"="jreyes0@macromedia.com","date"="12/29/2016","opt-in"="FALSE","unique_code"="ASD34R"
2 "first_name"="Doris","last_name"="Wood","email"="dwood1@1und1.de","date"="04/22/2016","opt-in"="0","unique_code"="SDS56N"
As you can see, the "id" column is missing, and I want unique_code in first place.
I would really appreciate any help, ideas, or pointers.
Thanks
You could just change the way you build each line before writing it to the file, like this:
# -*- encoding: utf-8 -*-
import csv

with open('newfilename.csv', 'w') as f2:
    with open('mycsvfile.csv', mode='r') as infile:
        reader = list(csv.reader(infile))  # load the whole file as a list
        header = reader[0]                 # the first line is your header
        for row in reader[1:]:             # content is all the other lines
            if row[5] == '':
                row[5] = '0'               # keep it a string so it can be concatenated
            line = row[-1] + '\t'          # adding the unique code
            for j, e in enumerate(row[:-1]):  # every field except the unique code
                line += '"' + header[j] + '"="' + e + '",'  # adding elements in order
            f2.write(line[:-1] + '\n')     # writing line without last comma
I slightly changed the way the header is read, to avoid an unnecessary test on every line.
If your file is really big and/or you don't want to load it entirely into memory, you could change it to:
...
        reader = csv.reader(infile)  # no conversion to list
        header = next(reader)        # get first line
        for row in reader:           # continue to read one line per loop
            ...
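Put together, the streaming variant might look like the sketch below. The function name `convert` and the in-memory `io.StringIO` sample are assumptions made for illustration so the example runs without any files on disk; with real files you would pass the two open file objects instead. It also builds the pairs with `zip` and `join` rather than string concatenation, which is a stylistic swap and not part of the answer above.

```python
import csv
import io

def convert(infile, outfile):
    """Hypothetical helper: stream rows from infile, write one formatted line per row."""
    reader = csv.reader(infile)
    header = next(reader)          # first line holds the column names
    for row in reader:             # one row at a time, nothing is buffered
        if row[5] == '':
            row[5] = '0'           # empty opt-in becomes "0"
        # Pair every column name with its value, leaving out the last column.
        pairs = ','.join('"%s"="%s"' % (name, value)
                         for name, value in zip(header[:-1], row[:-1]))
        outfile.write(row[-1] + '\t' + pairs + '\n')   # unique code goes first

# Two rows from the question, held in memory instead of mycsvfile.csv.
sample = io.StringIO(
    "id,first_name,last_name,email,date,opt-in,unique_code\n"
    "1,Jimmy,Reyes,jreyes0@macromedia.com,12/29/2016,FALSE,ER45DH\n"
    "2,Doris,Wood,dwood1@1und1.de,04/22/2016,,MU34T3\n"
)
result = io.StringIO()
convert(sample, result)
print(result.getvalue(), end='')
```

Because `convert` only needs objects that behave like files, the same function should work unchanged with `open('mycsvfile.csv')` and `open('newfilename.csv', 'w')`.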
You should process the header line separately, and then process each row correctly. Your code could become:
import csv

with open('newfilename.csv', 'w') as f2:
    with open('mycsvfile.csv', mode='r') as infile:
        reader = csv.reader(infile)
        header = next(reader)  # store the headers and advance reader pointer
        for rows in reader:
            if rows[5] == "":
                rows[5] = "0"  # special processing for 6th field
            # uses last field here
            pat = rows[-1]+'\t'+'''"%s"="%%s",'''*(len(header)-2)+'''"%s"="%%s"\n'''
            # process everything except last field
            f2.write((pat % tuple(header[:-1])) % tuple(rows[:-1]))
No need to load the whole file in memory...
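The two-pass `%` formatting in that pattern can be hard to read, so here is a small sketch of what each pass does. The three-column `header` and `row` lists are made-up sample data, shortened from the question's file to keep the output readable.

```python
# Shortened, made-up sample: two data columns plus the unique code.
header = ['id', 'first_name', 'unique_code']
row = ['1', 'Jimmy', 'ER45DH']

# One '"%s"="%%s"' slot per column except the last one, which becomes the prefix.
pat = row[-1] + '\t' + '"%s"="%%s",' * (len(header) - 2) + '"%s"="%%s"\n'

step1 = pat % tuple(header[:-1])   # first pass fills in the names; each %%s collapses to %s
step2 = step1 % tuple(row[:-1])    # second pass fills in the values

print(repr(step1))
print(repr(step2))
```

One caveat with this approach: a literal `%` in a column name or in the unique code would be read as a format specifier and break the formatting, so it fits data like the sample above, where that character does not appear.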