
How can I open a csv file in python, and read one line at a time, without loading the whole csv file in memory?

I have a csv file that is too large to fit in my machine's memory. So I want to open the csv file and then read its rows one at a time. I basically want to make a python generator that yields single rows from the csv.

Thanks in advance! :)

with open(filename, "r") as file:
    for line in file:
        doanything(line)  # placeholder: process one line at a time

Python is lazy whenever possible. File objects are iterators: looping over one reads the file a line at a time (through a small buffer) rather than loading the entire file into memory.

My personal preference for doing this is with csv.DictReader

You create the reader around an open file object, passing any options as parameters, and then access the file one row at a time by calling next on it or by looping over it. Each row comes back as a dictionary that maps the field names (taken from the header line by default) to that row's values.

For example:

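import csv

# newline='' is what the csv docs recommend when opening a file for the csv module
with open('names.csv', newline='') as csvfile:
    my_reader = csv.DictReader(csvfile)

    # next() pulls a single row; the header line is consumed as the field names
    first_row = next(my_reader)

    # the loop continues from the second data row, one row at a time
    for row in my_reader:
        print(list(row.items()))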

See the linked docs for parameter usage etc - it's fairly straightforward.

Solution:
You can use the chunksize parameter of the pandas read_csv function.

import pandas as pd

chunksize = 10 ** 6  # number of rows per chunk
for chunk in pd.read_csv(filename, chunksize=chunksize):
    print(type(chunk))  # each chunk is a DataFrame
    # CODE HERE

Set chunksize to 1 to get one row per iteration; each chunk is then a single-row DataFrame.
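
If single rows are wanted but one-row chunks feel wasteful, one possible middle ground is to read moderately sized chunks and yield rows from each. This is only a sketch: the function name iter_rows_in_chunks and the default chunk size are made up for illustration, and itertuples is just one of several ways to walk a DataFrame row by row.

```python
import pandas as pd


def iter_rows_in_chunks(path, chunksize=100_000):
    """Yield rows one at a time while pandas reads the file chunk by chunk.

    Hypothetical helper: only `chunksize` rows are held in memory at once.
    """
    for chunk in pd.read_csv(path, chunksize=chunksize):
        # itertuples yields one namedtuple per row of the current chunk
        for row in chunk.itertuples(index=False):
            yield row
```

Each yielded row exposes the columns as attributes (for example row.name), provided the column names are valid Python identifiers.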

"python generator that yields single rows from the csv."

This sounds like you want csv.reader from the built-in csv module. You will get one list for each row in the file.
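
A minimal sketch of such a generator, assuming a UTF-8 encoded file; the name iter_csv_rows and its parameters are invented here for illustration and are not part of the csv module:

```python
import csv
from typing import Iterator, List


def iter_csv_rows(path: str, encoding: str = "utf-8") -> Iterator[List[str]]:
    """Yield one parsed row (a list of strings) at a time from a CSV file."""
    # newline="" lets the csv module handle line endings itself,
    # including newlines embedded inside quoted fields.
    with open(path, newline="", encoding=encoding) as csvfile:
        for row in csv.reader(csvfile):
            yield row
```

Nothing is read until the generator is iterated, so "for row in iter_csv_rows(filename):" keeps only the current row (plus a small read buffer) in memory. The header line, if there is one, comes back as the first row.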
