
Python: instead of a list of 40 million consecutive numbers, how would I make a 2-D list of 40 lists, each holding 1 million consecutive numbers?

I am currently working on a MapReduce algorithm and I need to build my data source a little better. The program produces a list of nonces to use in a hash algorithm to find "good" (low-value) hashes, very similar to Bitcoin. Right now I build a single list of 40 million consecutive numbers (nonces), but the I/O overhead (using mincemeat.py) is making the program very slow.

Currently I am using this to create my list:

# Build the data source: 40 million consecutive nonces
nonces = list(range(0, 40000000))
# Create a dict mapping each index to its nonce (one entry per nonce)
datasource = dict(enumerate(nonces))

How could I alter the first line of code to create a list of size 40 containing lists of size 1 million, so that the first list would be 1 to 1 million, the next 1 million to 2 million, and so on? Do I need to break down and write a for loop, or is there a simple one-liner I could use to achieve this?

Here is how I would implement it with a for loop. Can it be condensed? (I know I have repeating numbers.)

nonceList = []
for j in range(0, 40):
    nonceList.append([i for i in range(j*1000000, (j+1)*1000000)])
datasource = dict(enumerate(nonceList))
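
For reference, the loop above can be condensed into a single nested expression. This is only a minimal sketch: `build_chunks` is a hypothetical helper name, not something from mincemeat.py, and the sizes are kept tiny so the result is easy to inspect. With the sizes from the question the call would be `build_chunks(40, 1000000)`, which still materializes all 40 million integers in memory.

```python
def build_chunks(num_chunks, chunk_size):
    """Return num_chunks lists, each holding chunk_size consecutive integers.

    Hypothetical helper for illustration: chunk j covers
    j * chunk_size up to (j + 1) * chunk_size - 1.
    """
    return [list(range(j * chunk_size, (j + 1) * chunk_size))
            for j in range(num_chunks)]


# Small sizes for illustration only.
nonce_list = build_chunks(4, 5)

# Same final step as in the question: one dict entry per chunk.
datasource = dict(enumerate(nonce_list))
```

Each dict value is now a whole chunk rather than a single nonce, so the data source has one entry per chunk.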

Don't produce consecutive numbers up front; just have your MapReduce tasks produce them from a starter number.

E.g. for 40 tasks, number them 0 through 39 and use a multiplier to generate the numbers inside each task. In Python 2, use xrange() to generate the numbers, as range() will produce a list, materializing a million integer objects for no gain.
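
A rough sketch of that idea follows. The names `CHUNK_SIZE`, `NUM_TASKS`, `lazy_range`, and `nonces_for_task` are assumptions made for illustration, and how the function is called from inside your map function depends on your mincemeat.py setup, which is not shown here.

```python
CHUNK_SIZE = 1000000  # nonces covered by each task (assumed from the question)
NUM_TASKS = 40        # tasks are numbered 0 through 39

try:
    lazy_range = xrange  # Python 2: lazy, unlike range()
except NameError:
    lazy_range = range   # Python 3: range() is already lazy

# The data source carries only one small task number per entry,
# so very little data has to be sent to each task.
datasource = dict(enumerate(range(NUM_TASKS)))


def nonces_for_task(task_number, chunk_size=CHUNK_SIZE):
    """Return a lazy sequence of the consecutive nonces for one task."""
    start = task_number * chunk_size  # the multiplier step
    return lazy_range(start, start + chunk_size)
```

Inside a task, iterating with `for nonce in nonces_for_task(task_number):` walks the task's million nonces without ever building a list of them.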
