csv.reader misses first line
I am using csv.reader in Python to read a CSV file into a dictionary. The first column of the CSV is a date (in one of two possible formats), which is read in as a datetime object and becomes the key of the dict. I also read columns 3 and 4 (rows[3] and rows[4]):
import datetime as dt
import csv

with open(fileInput, 'r') as inFile:
    csv_in = csv.reader(inFile)
    try:
        dictData = {dt.datetime.strptime(rows[0], '%d/%m/%Y %H:%M'): [rows[3], rows[4]]
                    for rows in csv_in}
    except:
        dictData = {dt.datetime.strptime(rows[0], '%Y-%m-%d %H:%M:%S'): [rows[3], rows[4]]
                    for rows in csv_in}
It works, except that the first date in the file (1/7/2012 00:00) doesn't appear in the dictionary. Do I need to tell csv.reader that the first row is not a header row, and if so, how?
When you run a try/except statement, it is easy to believe that Python will first try something and, if that fails, revert your environment to the state it was in before the try statement was executed. It does not do this. As such, you have to be aware of unintended side effects that a failed try attempt might leave behind.
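A minimal sketch of that point, using a plain list instead of a file (the names `seen` and `risky` are made up for illustration). The append performed inside the try block is still there after the exception has been handled:

```python
seen = []

def risky(items):
    # Mutates its argument, then fails part-way through.
    items.append("side effect")
    raise ValueError("something went wrong")

try:
    risky(seen)
except ValueError:
    # Nothing is rolled back: the append above has already happened.
    pass

print(seen)  # ['side effect']
```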
What is happening in your case is that the dictionary comprehension calls next(...) on your csv.reader() object (csv_in), which returns the first line of the CSV file. You have now used up the first item from the csv.reader() iterator. Remember, Python won't revert to a previous state if the try block fails.
An exception is then raised, I'm presuming because the date is in the wrong format. When the except block takes over and calls next(...) on your csv_in object, you get the second item in the iterator. The first has already been used.
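The behaviour can be reproduced without a real file. This is only a sketch: the two-row sample and its column values are invented, and it assumes the dates in the file use the second format, so the first strptime call fails on the very first row.

```python
import csv
import datetime as dt
import io

# Hypothetical two-row file whose dates use the second format.
sample = io.StringIO(
    "2012-07-01 00:00:00,a,b,1.0,2.0\n"
    "2012-07-01 00:30:00,a,b,3.0,4.0\n"
)
csv_in = csv.reader(sample)

try:
    data = {dt.datetime.strptime(row[0], '%d/%m/%Y %H:%M'): [row[3], row[4]]
            for row in csv_in}
except ValueError:
    # The failed attempt already pulled the first row out of csv_in,
    # so this comprehension starts at the second row.
    data = {dt.datetime.strptime(row[0], '%Y-%m-%d %H:%M:%S'): [row[3], row[4]]
            for row in csv_in}

print(len(data))  # 1, the 00:00 row is missing
```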
A simple change to get around this is to copy the rows out of the csv iterator into a list before parsing, so that a failed attempt does not consume anything and both attempts see every row.
import datetime as dt
import csv

with open(fileInput, 'r') as inFile:
    # Read every row once; a list can be iterated over again after a failure.
    all_rows = list(csv.reader(inFile))
    try:
        dictData = {dt.datetime.strptime(rows[0], '%d/%m/%Y %H:%M'): [rows[3], rows[4]]
                    for rows in all_rows}
    except ValueError:
        dictData = {dt.datetime.strptime(rows[0], '%Y-%m-%d %H:%M:%S'): [rows[3], rows[4]]
                    for rows in all_rows}
Finally, I would recommend against catching a generic Exception (or using a bare except, as in the question). I think you want to catch a ValueError, which is what strptime raises when a string does not match the format.
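Catching only ValueError also makes it practical to pick the format per row instead of per file. The following is a sketch rather than the asker's code: `parse_date`, `read_rows`, and the sample data are hypothetical names and values, and it assumes the two formats from the question are the only ones that occur.

```python
import csv
import datetime as dt
import io

# The two formats from the question.
DATE_FORMATS = ('%d/%m/%Y %H:%M', '%Y-%m-%d %H:%M:%S')

def parse_date(text):
    """Return a datetime for the first matching format, else raise ValueError."""
    for fmt in DATE_FORMATS:
        try:
            return dt.datetime.strptime(text, fmt)
        except ValueError:
            continue  # wrong format for this row, try the next one
    raise ValueError(f"unrecognised date: {text!r}")

def read_rows(lines):
    """Build the dict from any iterable of CSV lines (an open file works too)."""
    return {parse_date(row[0]): [row[3], row[4]] for row in csv.reader(lines)}

# Invented sample mixing both formats.
sample = io.StringIO(
    "1/7/2012 00:00,a,b,1.0,2.0\n"
    "2012-07-01 00:30:00,a,b,3.0,4.0\n"
)
dict_data = read_rows(sample)
```

Because each row is parsed on its own, the reader is only iterated once and no row is lost; any other kind of error (for example an IndexError from a short row) still propagates instead of being silently swallowed.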