
Python - Efficient way to read large amounts of tabular data

I have a file containing a large table of numbers, roughly 300 MB in size. I want to read this in Python.

The data looks like this:

-200 1 11097.4 16414.2 1
-200 1 11197.4 16414.8 1
-200 1 11297.4 16415.4 1
-200 1 11397.4 16416 1
-200 1 11497.4 16416.5 1
-200 1 11597.4 16417.1 1
-200 1 11697.4 16417.7 1

The Python code looks like this:

    summary = []
    with open(filename) as f:
        # First line holds the table dimensions.
        nrow, ncol = [int(x) for x in next(f).split()]
        for k in range(2):
            rr = []
            for i in range(nrow + 1):
                row = []
                for j in range(ncol + 1):
                    # One text line per table cell; the fourth field is skipped.
                    a = next(f).split()
                    row.append([int(a[0]), int(a[1]), float(a[2]), float(a[4])])
                rr.append(row)
            summary.append(rr)

This is very slow; it takes about 60 seconds to read the file. I want to get it down to less than 10 seconds. What's the simplest way to make it a bit faster?

I am perfectly happy to change the data file format, if it helps.

Use pandas. This might be a duplicate, so also check out these answers.

code.py

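    import numpy as np
    import pandas as pd

    # Parse the whitespace-separated text once (this is the slow step).
    # skiprows=1 skips the "nrow ncol" line at the top of the question's file.
    df = pd.read_csv("large_file.txt", sep=r"\s+", header=None, skiprows=1)

    # Cache the values in NumPy's binary .npy format.
    np.save("large_file.npy", df.values)

    # Later runs can load the binary file instead of re-parsing the text.
    data = np.load("large_file.npy")
    print(data.shape)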
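
If the nested `summary[k][i][j]` layout built by the loop in the question is still needed, one possible approach is to parse the file in a single pandas call and then reshape the result. The sketch below assumes the file layout shown in the question: a first line with `nrow ncol`, followed by exactly `2 * (nrow + 1) * (ncol + 1)` data lines of five fields each. The helper name `read_summary` is hypothetical. Note that the two integer columns come back as floats, because the result is a single NumPy array.

```python
import numpy as np
import pandas as pd


def read_summary(path):
    """Read the question's file layout into a 4-D array (hypothetical helper)."""
    # First line holds "nrow ncol", as in the question's code.
    with open(path) as f:
        nrow, ncol = (int(x) for x in next(f).split())

    # Parse the remaining lines in one call; keep fields 0, 1, 2 and 4,
    # the same fields the original loop kept.
    df = pd.read_csv(path, sep=r"\s+", header=None, skiprows=1,
                     usecols=[0, 1, 2, 4])

    # The original loops run k in range(2), i in range(nrow + 1),
    # j in range(ncol + 1), so the lines already arrive in that order.
    values = df.to_numpy(dtype=float)
    return values.reshape(2, nrow + 1, ncol + 1, 4)
```

With this, `read_summary(filename)[k, i, j]` corresponds to `summary[k][i][j]` in the original code. The reshape raises a `ValueError` if the number of data lines does not match the dimensions on the first line.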
