csv.reader misses first line

I am using csv.reader in Python to read a csv file into a dictionary. The first column of the csv is a date (in one of 2 possible formats), which is read in as a datetime object and becomes the key of the dict, and I also read columns 3 and 4:

import datetime as dt
import csv
with open(fileInput,'r') as inFile:
    csv_in = csv.reader(inFile)
    try:
        dictData = {(dt.datetime.strptime(rows[0], '%d/%m/%Y %H:%M')): [rows[3], rows[4]]
                        for rows in csv_in}
    except:
        dictData = {(dt.datetime.strptime(rows[0], '%Y-%m-%d %H:%M:%S')): [rows[3], rows[4]]
                        for rows in csv_in}

It works, except that the first date in the file (1/7/2012 00:00) doesn't appear in the dictionary. Do I need to tell csv.reader that the first row is not a header row and if so, how?

When you use a try/except statement, it is easy to assume that Python will first try something and, if that fails, restore your environment to the state it was in before the try block ran. It does not do this. You therefore have to watch for unintended side effects of a failed try block.

What happens in your case is that the dictionary comprehension calls next(...) on your csv.reader() object (csv_in), which returns the first line of the csv file. That first item of the csv.reader() iterator is now used up. Remember, Python won't revert to a previous state if the try block fails.

An exception is then raised, presumably because the date is in the other format. When the except block takes over and calls next(...) on csv_in, it gets the second item in the iterator. The first has already been used.
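The effect described above can be reproduced in a few lines. This is only a minimal sketch: the in-memory sample data and the names sample, first_attempt and remaining are made up for illustration and are not from the question.

```python
import csv
import datetime as dt
import io

# Hypothetical sample data: ISO-style dates, so the first format fails.
sample = io.StringIO(
    "2012-07-01 00:00:00,a,b,1,2\n"
    "2012-07-01 00:30:00,a,b,3,4\n"
    "2012-07-01 01:00:00,a,b,5,6\n"
)
csv_in = csv.reader(sample)

try:
    # strptime fails on the very first row, but the reader has
    # already handed that row out and will not give it back.
    first_attempt = {
        dt.datetime.strptime(row[0], '%d/%m/%Y %H:%M'): [row[3], row[4]]
        for row in csv_in
    }
except ValueError:
    # The reader resumes at the second row; nothing is rewound.
    remaining = list(csv_in)

print(len(remaining))  # 2
```

Only two of the three rows are left for the except branch, which matches the missing first date in the question.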

A simple way around this is to read all the rows into a list before parsing, so that both attempts iterate over the same complete data.

import datetime as dt
import csv
with open(fileInput,'r') as inFile:
    # Read every row up front; a list can be iterated more than once
    all_rows = list(csv.reader(inFile))
try:
    dictData = {(dt.datetime.strptime(rows[0],'%d/%m/%Y %H:%M')):
                  [rows[3],rows[4]] for rows in all_rows}
except ValueError:
    # First format did not match: retry every row with the second format
    dictData = {(dt.datetime.strptime(rows[0],'%Y-%m-%d %H:%M:%S')):
                  [rows[3],rows[4]] for rows in all_rows}
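Another option is to choose the format per row, so the reader is only traversed once and nothing needs to be retried. The sketch below is an assumption-laden illustration: parse_date, read_rows, DATE_FORMATS and the sample data are hypothetical names that do not appear in the question. It also accepts files that mix both formats, which may or may not be what you want.

```python
import csv
import datetime as dt
import io

# The two formats from the question, tried in order for each row.
DATE_FORMATS = ('%d/%m/%Y %H:%M', '%Y-%m-%d %H:%M:%S')

def parse_date(text):
    """Return a datetime using the first matching format (hypothetical helper)."""
    for fmt in DATE_FORMATS:
        try:
            return dt.datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next format
    raise ValueError(f"unrecognised date: {text!r}")

def read_rows(lines):
    """Build the dict from any iterable of CSV lines in a single pass."""
    return {parse_date(row[0]): [row[3], row[4]] for row in csv.reader(lines)}

# Made-up sample data standing in for the real file.
sample = io.StringIO(
    "1/7/2012 00:00,a,b,1,2\n"
    "2012-07-01 00:30:00,a,b,3,4\n"
)
dictData = read_rows(sample)
```

With a real file, read_rows(inFile) could be called inside the with block in place of the in-memory sample.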

Finally, I would recommend against a bare except, which catches every exception. Here you most likely want to catch only ValueError, which is what strptime raises when the date string does not match the format.
