
Fastest way to export data of huge Python lists to a text file

I am looking for the most performant way to export the elements of up to ten Python lists [x1, x2, x3, ... xn], [y1, y2, y3, ... yn], [z1, z2, z3, ... zn], ... to a text file with the following structure:

x1 y1 z1  .  .  . 
x2 y2 z2  .  .  .
x3 y3 z3  .  .  .
 .  .  .  .  .  .
 .  .  .  .  .  .
 .  .  .  .  .  .
xn yn zn  .  .  .

What makes it challenging is that each list may have up to 1 million elements.

Any suggestions are highly appreciated.

Use the csv module and its writerows method to write the list of lists in a single call.

Small standalone test:

import random,time


lists = [[random.randint(1,500) for _ in range(100000)] for _ in range(100)]

import csv
start_time=time.time()

with open("out.csv","w",newline="") as f:
    cw = csv.writer(f,delimiter=" ")
    cw.writerows(lists)

print(time.time()-start_time)

This writes 100 lines of 100,000 elements each in about 2 seconds on my machine (generating the lists was slower than writing them out).

So you're only limited by the memory needed to hold your input lists.

EDIT: the code above does not transpose the data (each list becomes a line instead of a column), so it's cheating. Using zip (Python 3) does the trick directly with writerows, so the code doesn't change much:

import random,time

n=1000000
list1 = list(range(1,n))
list2 = list(range(n+1,n*2))
list3 = list(range(2*n+1,n*3))

import csv
start_time=time.time()

with open("out.csv","w",newline="") as f:
    cw = csv.writer(f,delimiter=" ")
    cw.writerows(zip(list1,list2,list3))

print(time.time()-start_time)
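
To make the difference between the two versions concrete, here is a minimal sketch on three tiny lists. The helper name to_text and the sample data are made up for illustration, and lineterminator="\n" is set only to keep the in-memory output easy to read; it is not part of the answer above.

```python
import csv
import io

# Three short columns standing in for the large lists (illustrative data).
xs = [1, 2, 3]
ys = [10, 20, 30]
zs = [100, 200, 300]

def to_text(rows):
    """Write rows space-delimited into an in-memory buffer and return the text."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=" ", lineterminator="\n").writerows(rows)
    return buf.getvalue()

# Passing the lists directly writes one list per line (not the requested layout).
untransposed = to_text([xs, ys, zs])

# zip pairs up the i-th elements, so each output line is "x_i y_i z_i".
transposed = to_text(zip(xs, ys, zs))

print(untransposed)
print(transposed)
```

The first print shows each list on its own line; the second shows the x1 y1 z1 layout asked for in the question.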

For Python 2, use itertools.izip, because zip returns a full list there, which is not memory-efficient. Python 2 compliant code:

import csv
import itertools

# Python 2: the csv module expects the file to be opened in binary mode
with open("out.csv","wb") as f:
    cw = csv.writer(f,delimiter=" ")
    cw.writerows(itertools.izip(list1,list2,list3))

If you have a list of lists:

list_of_lists = [list1,list2,list3]

you can use * to unpack the list into arguments for zip or izip:

cw.writerows(zip(*list_of_lists))  # Python 3

cw.writerows(itertools.izip(*list_of_lists))  # Python 2
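
Put together for Python 3, the unpacking approach could be wrapped roughly as follows. This is only a sketch: the function name write_columns and its fill parameter are hypothetical, and the padding branch is an addition not discussed above. It is there because zip silently stops at the shortest list, which may or may not be what you want when the lists differ in length.

```python
import csv
import itertools

def write_columns(path, columns, fill=None):
    """Write each list in `columns` as one column of a space-delimited file.

    With fill=None, output stops at the shortest list (plain zip behaviour).
    With a fill value, shorter lists are padded via itertools.zip_longest.
    """
    if fill is None:
        rows = zip(*columns)  # lazy in Python 3, no transposed copy is built
    else:
        rows = itertools.zip_longest(*columns, fillvalue=fill)
    with open(path, "w", newline="") as f:
        csv.writer(f, delimiter=" ").writerows(rows)
```

Calling write_columns("out.txt", list_of_lists) would then write one line per index, with one value from each list.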

You can do something like this (Python 2, since it relies on itertools.izip):

from itertools import izip
import csv

with open('new_file', 'w') as f:
    writer = csv.writer(f, delimiter=' ')
    for a in izip(l1, l2, ....., l10):
        writer.writerow(a)
