
How can I open a csv file in Python and read one line at a time, without loading the whole csv file in memory?

I have a csv file that is too large to fit in my machine's memory. So I want to open the csv file and then read its rows one at a time. I basically want to make a Python generator that yields single rows from the csv.

Thanks in advance! :)

with open(filename, "r") as file:
    for line in file:
        doanything()

Python is lazy whenever possible. File objects are iterators (strictly speaking, not generators): looping over one reads a single line at a time instead of loading the entire file.
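To see the laziness directly, here is a small sketch. The file name demo.csv and its contents are made up for illustration; the point is only that a file object is its own iterator, so next() pulls exactly one line on demand.

```python
import os
import tempfile

# Hypothetical sample data, written to a temporary file for the demo.
path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(path, "w", newline="") as f:
    f.write("name,age\nalice,30\nbob,25\n")

with open(path, "r", newline="") as f:
    # A file object is its own iterator: iter(f) returns f itself.
    is_own_iterator = iter(f) is f
    header = next(f)      # consumes only the first line
    first = next(f)       # consumes only the second line
    remaining = list(f)   # whatever has not been consumed yet
```

Under the hood Python reads the file in small buffered blocks, so memory use stays roughly constant no matter how large the file is. Note that this yields raw text lines; splitting them on commas by hand breaks on quoted fields, which is what the csv module handles for you.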

My personal preference for doing this is csv.DictReader.

You create the reader object from an open file plus any optional parameters. To read the file one row at a time, you iterate over it (or call next on it), and each row comes back as a dictionary that maps the field names from the header row to that row's values.

e.g.

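import csv

# newline='' is what the csv docs recommend when opening CSV files
with open('names.csv', newline='') as csvfile:
    my_reader = csv.DictReader(csvfile)

    first_row = next(my_reader)  # pull a single row on demand

    for row in my_reader:  # continues with the rows after first_row
        print(list(row.items()))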

See the csv.DictReader docs for parameter usage etc. It's fairly straightforward.
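As one illustration of those parameters, here is a minimal sketch assuming a headerless, semicolon-separated input. The sample data and the column names "name" and "age" are invented for the example; fieldnames and delimiter are standard csv.DictReader arguments.

```python
import csv
import io

# Hypothetical headerless, semicolon-separated data standing in for a real file.
data = io.StringIO("alice;30\nbob;25\n")

# fieldnames supplies the keys when the file has no header row;
# delimiter is passed through to the underlying csv.reader.
reader = csv.DictReader(data, fieldnames=["name", "age"], delimiter=";")

first = dict(next(reader))   # one row is read per next() call
second = dict(next(reader))
```

All values come back as strings, so convert them (for example with int) if you need numbers.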

Solution:
You can use the chunksize parameter of the pandas read_csv function.

import pandas as pd

chunksize = 10 ** 6  # number of rows per chunk
for chunk in pd.read_csv(filename, chunksize=chunksize):
    print(type(chunk))  # each chunk is a DataFrame
    # CODE HERE

Set chunksize to 1 and each chunk will be a one-row DataFrame, which should take care of your problem statement.

"python generator that yields single rows from the csv."

This sounds like you want csv.reader from the built-in csv module. You will get one list of strings for each row in the file.
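A minimal sketch of such a generator follows. The function name iter_csv_rows, the utf-8 default, and the demo file are assumptions made for illustration, not something from the question.

```python
import csv
import os
import tempfile


def iter_csv_rows(path, encoding="utf-8"):
    """Yield one row (a list of strings) at a time from the CSV at `path`."""
    # newline="" lets the csv module handle line endings inside quoted fields.
    with open(path, newline="", encoding=encoding) as f:
        for row in csv.reader(f):
            yield row


# Demo with a made-up file; a real call would pass the path of the large CSV.
path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(path, "w", newline="", encoding="utf-8") as f:
    f.write('name,note\nalice,"hello, world"\n')

rows = iter_csv_rows(path)
header = next(rows)   # first row of the file
first = next(rows)    # the quoted comma stays inside a single field
```

Because the file is opened inside the generator, it stays open only while rows are being consumed and is closed once the generator is exhausted or garbage collected.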


