Using buffered reader for large .csv files, Python

I'm trying to open large .csv files (16k+ lines, ~15 columns) in a Python script, and am having some issues.

I use the built-in open() function to open the file, then create a csv.DictReader from the input file. The loop is structured like this:

for (i, row) in enumerate(reader):
     # do stuff (send serial packet, read response)

However, if I use a file longer than about 20 lines, the file will open, but within a few iterations I get a ValueError: I/O operation on a closed file.

My thought is that I might be running out of memory (though the 16k-line file is only 8 MB, and I have 3 GB of RAM), in which case I expect I'll need to use some sort of buffer to load only sections of the file into memory at a time.

Am I on the right track? Or could there be other causes for the file closing unexpectedly?

Edit: about half of the times I run this with a CSV of only 11 lines, I also get the ValueError. The error does not always happen at the same line.

16k lines is nothing for 3 GB of RAM, so your problem is most probably something else, e.g. some other part of your code takes too long or otherwise interferes with the open file. To rule that out, and for speed anyway since you have 3 GB of RAM, load the whole file into memory and then parse it, e.g.:

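import csv
import io  # Python 3 replacement for cStringIO

# Read the whole file into memory; the file is closed before parsing starts.
with open("/tmp/1.csv", newline="") as f:
    data = f.read()

reader = csv.DictReader(io.StringIO(data))
for row in reader:
    print(row)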

This way, at least, you shouldn't get the closed-file error, because the file has been read completely before the loop starts.
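
For illustration, here is a minimal Python 3 sketch of how this particular ValueError arises and one way to rule it out. The helper names and the sample columns are made up for the example, and the explicit f.close() only stands in for whatever closes the file in the real script, which the question does not show.

```python
import csv
import os
import tempfile


def write_sample_csv(n_rows):
    """Create a throwaway CSV file (hypothetical columns) and return its path."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "value"])
        for i in range(n_rows):
            writer.writerow([i, i * 2])
    return path


def read_with_early_close(path):
    """Reproduce the error: the file is closed while the reader is still in use."""
    f = open(path, newline="")
    reader = csv.DictReader(f)
    first = next(reader)  # works, the file is still open
    f.close()             # stand-in for whatever closes the file too early
    try:
        next(reader)      # the reader now pulls from a closed file
    except ValueError as exc:
        return first, str(exc)
    return first, None


def read_safely(path):
    """Keep the file open for the whole loop by iterating inside the with block."""
    rows = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            rows.append(row)  # the per-row work would go here
    return rows
```

If the loop in the real script runs entirely inside the with block and the error still appears, something else is closing the file object, for example other code that shares the same variable name.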

csv.reader is faster than csv.DictReader. Read the whole file in blocks. To avoid memory leaks, it is better to do the work in a subprocess:

from multiprocessing import Process

def child_process(resource):
     # Do the read and process stuff here.
     pass

if __name__ == '__main__':
     # Get file object resource.
      .....
     p = Process(target=child_process, args=(resource,))
     p.start()
     p.join()

For more information, see this link: http://articlesdictionary.wordpress.com/2013/09/29/read-csv-file-in-python/
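
As a rough sketch of what "reading in blocks" could look like in Python 3, the generator below pulls a fixed number of rows at a time from csv.reader using itertools.islice. The name iter_row_blocks, the block_size parameter and the sample data are assumptions made for the example and do not come from the linked article.

```python
import csv
import io
from itertools import islice


def iter_row_blocks(csv_file, block_size=1000):
    """Yield (header, rows) pairs with up to block_size rows per block."""
    reader = csv.reader(csv_file)
    header = next(reader, None)  # the first line is assumed to be the header
    if header is None:
        return  # empty input: nothing to yield
    while True:
        block = list(islice(reader, block_size))
        if not block:
            break
        yield header, block


# Usage with an in-memory sample; a real script would pass an open file instead.
sample = io.StringIO("id,value\n0,0\n1,2\n2,4\n")
for header, block in iter_row_blocks(sample, block_size=2):
    print(header, len(block))
```

Only one block of rows is held in memory at a time, although for an 8 MB file this is unlikely to matter in practice.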
