How can I create a new csv after finding the header row?
I am reading a csv file that has about 7-8 lines at the top that are a description of my file. I am getting to the header row by using the following code:
    list_of_files = glob.glob('C:/payment_reports/*csv') # * means all if need specific format then *.csv
    latest_file = max(list_of_files, key=os.path.getctime)
    print (latest_file)
    line_count = None
    for row in csv.reader(open(latest_file)):
        if row[0] == 'date/time':
            print (row)
            break
    else:
        print("{} not found".format('name'))
I am getting to the correct line, since the row that prints is:
['date/time', 'settlement id', 'type', 'order id', 'sku', 'description', 'quantity', 'marketplace', 'fulfillment', 'order city', 'order state', 'order postal', 'product sales', 'shipping credits', 'gift wrap credits', 'promotional rebates', 'sales tax collected', 'Marketplace Facilitator Tax', 'selling fees', 'fba fees', 'other transaction fees', 'other', 'total']
Now how do I save the header row plus all the rows after it as a new csv? I have a line_count, but before I try it with a new variable, I am sure there are functions in the csv module, using the index of the row, that I can use to make things simpler. What do you suggest is the best way to do this?
Solution (thanks @bruno desthuilliers):
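    import csv
    import glob
    import os

    list_of_files = glob.glob('C:/payment_reports/*csv')  # any path ending in "csv"; use '*.csv' to require the extension
    latest_file = max(list_of_files, key=os.path.getctime)
    print(latest_file)

    with open(latest_file, "r", newline="") as infile:
        reader = csv.reader(infile)
        for row in reader:
            # blank description lines are read as empty lists, so check row first
            if row and row[0] == 'date/time':
                print(row)
                break
        else:
            # the loop ended without a break, so there is no header row;
            # a bare "break" is a syntax error here, so stop the script instead
            raise SystemExit("{} not found".format('date/time'))

        # still inside the outer "with": the reader needs infile to stay open
        with open("C:/test.csv", "w", newline="") as outfile:
            writer = csv.writer(outfile)
            writer.writerow(row)      # headers
            writer.writerows(reader)  # remaining rows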
csv.reader is an iterator. It reads a line from the csv every time that .next is called (in Python 3, next(reader)). Here's the documentation: http://docs.python.org/2/library/csv.html

An iterator object can return values from a source that is too big to read all at once. Using a for loop with an iterator effectively calls .next each time through the loop. Hope this helps.
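To make the iterator behaviour concrete, here is a minimal Python 3 sketch. The sample data is made up for illustration and stands in for a real report file; `io.StringIO` is only used so the example runs without touching the disk.

```python
import csv
import io

# Hypothetical sample data: one description line, a header row, two data rows.
sample = io.StringIO(
    "report description\n"
    "date/time,total\n"
    "2020-01-01,10\n"
    "2020-01-02,20\n"
)

reader = csv.reader(sample)

first = next(reader)   # consumes the description line
second = next(reader)  # consumes the header row

# A for loop (or comprehension) continues from where next() left off;
# rows that were already consumed are not returned again.
remaining = [row for row in reader]

print(first)
print(second)
print(remaining)
```

Because the reader keeps its position, anything that consumes it after the header row has been found only sees the rows that follow.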
Once you have found the header row, you can write it and the remaining rows to your outfile:
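    # Python 2: the csv module expects files opened in binary mode ("rb"/"wb").
    # On Python 3, open with mode "r"/"w" and newline="" instead.
    with open(latest_file, "rb") as infile:
        reader = csv.reader(infile)
        for row in reader:
            if row[0] == 'date/time':
                break
        else:
            print("{} not found".format('name'))
            return  # assumes this code runs inside a function

        # nested so that infile is still open while the reader is consumed
        with open("path/to/new.csv", "wb") as outfile:
            writer = csv.writer(outfile)
            writer.writerow(row)      # headers
            writer.writerows(reader)  # remaining rows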
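For Python 3, the same approach could be wrapped in a function along these lines. This is only a sketch: the function name `copy_from_header` and its parameters are hypothetical, not part of the csv module, and it assumes the header row is identified by the value of its first cell.

```python
import csv


def copy_from_header(src_path, dst_path, header_cell="date/time"):
    """Copy the header row and every row after it from src_path to dst_path.

    Returns True if the header row was found, False otherwise.
    (Hypothetical helper, written for illustration only.)
    """
    with open(src_path, "r", newline="") as infile:
        reader = csv.reader(infile)
        for row in reader:
            # Blank lines are read as empty lists, so check row before row[0].
            if row and row[0] == header_cell:
                break
        else:
            # The loop finished without a break: no header row in this file.
            return False

        # Opened only after the header is found, so no empty file is left behind.
        with open(dst_path, "w", newline="") as outfile:
            writer = csv.writer(outfile)
            writer.writerow(row)      # header row
            writer.writerows(reader)  # reader continues right after the header
    return True
```

It could be called as `copy_from_header(latest_file, "C:/test.csv")`. Returning a boolean rather than printing leaves it to the caller to decide what to do when the header row is missing.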