
What is the fastest way to load a file using Python and NumPy?

I want to train a model, and I have a large training dataset, more than 20 GB in size. Reading it, that is, loading it into memory, takes a very long time.

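import csv
from itertools import islice
import numpy as np

# makefloat is my own helper (not shown); it converts a row of strings to floats
with open(file_path, newline='', encoding='utf-8') as f:
    reader = csv.reader(f)
    for row in islice(reader, 0, 1):  # first row creates the array
        train_data = np.array(makefloat(row))[None, :]
    for row in reader:
        # np.vstack copies the whole array again for every row
        train_data = np.vstack((train_data, np.array(makefloat(row))[None, :]))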

Each line contains 43 floats.

It is very slow: I tested it on just 100,000 lines and it took 20 minutes.

I think I'm doing something wrong. How can I make it faster?

It's not a good idea to read the entire file at once. You can use something like Dask, which reads the file in chunks and should be faster.
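To illustrate the chunked idea, here is a minimal sketch that uses only the standard `csv` module and NumPy rather than Dask itself. The names `iter_chunks` and `load_all`, the default `chunk_rows` value, and the `float32` dtype are assumptions made for the example and do not come from the question. It also assumes every field parses as a float and that the file has no header row. Each chunk is converted in a single call, so the array is not recopied once per row as it is with a per-row `np.vstack`.

```python
import csv
from itertools import islice

import numpy as np


def iter_chunks(file_path, chunk_rows=100_000):
    """Yield 2-D float32 arrays holding at most chunk_rows rows each."""
    with open(file_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        while True:
            # Pull the next block of rows (lists of strings) from the reader.
            rows = list(islice(reader, chunk_rows))
            if not rows:
                break
            # One conversion per chunk; NumPy parses the numeric strings.
            yield np.array(rows, dtype=np.float32)


def load_all(file_path, chunk_rows=100_000):
    """Join all chunks with a single concatenate call.

    Only sensible when the full array fits in RAM; raises ValueError
    on an empty file because there is nothing to concatenate.
    """
    return np.concatenate(list(iter_chunks(file_path, chunk_rows)), axis=0)
```

If the full array does not fit in memory, iterating over `iter_chunks(file_path)` and feeding one chunk at a time to the training step avoids holding everything at once. With Dask installed, `dask.dataframe.read_csv` plays a similar role by splitting the file into partitions.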

