How to quickly generate a list of all pairs from a large set of numbers?
I create a list with the numbers from 0 to 131071:
x = [i for i in range(131072)]
Then I build all pairs, excluding pairs of a number with itself:
pairs = []
append_pairs = pairs.append
for i in range(len(x)):
    for j in range(len(x)):
        if x[i] != x[j]:
            x2 = [x[i], x[j]]
            append_pairs(x2)
which gives:
pairs = [[0, 1], [0, 2], [0, 3], ... [131071, 131070]]
But written this way it takes a very long time. Can it be done faster?
You can use itertools.combinations, but that will probably also take a little while:
import itertools as it
n = 131072
pairs = it.combinations(range(n), 2)
Note that the code above does not give you a list of all pairs but an iterator that produces the pairs lazily:
>>> pairs
<itertools.combinations at 0x7fb939a72a48>
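Because the iterator is lazy, the pairs can be consumed in batches instead of being stored all at once. The following is a minimal sketch of that idea; the helper name `pair_chunks` and the chunk size are illustrative assumptions, not part of the original answer.

```python
import itertools as it

def pair_chunks(n, chunk_size):
    """Yield lists of at most chunk_size pairs (i, j) with i < j."""
    pairs = it.combinations(range(n), 2)
    while True:
        # islice pulls the next chunk_size items from the shared iterator.
        chunk = list(it.islice(pairs, chunk_size))
        if not chunk:
            return
        yield chunk

# Count the pairs for a small n without ever holding them all in memory.
total = sum(len(chunk) for chunk in pair_chunks(100, chunk_size=1000))
```

Only one chunk is alive at a time, so peak memory is governed by `chunk_size` rather than by the total number of pairs.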
You can get the list using
pairs = list(it.combinations(range(n), 2))
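The sample output in the question ends with [131071, 131070], which suggests that both orders of each pair are wanted. itertools.combinations yields each unordered pair only once; if ordered pairs are needed, itertools.permutations is one option. A small sketch, using n = 4 purely for illustration:

```python
import itertools as it

n = 4  # small value chosen for illustration only

# Every (i, j) with i != j, in the same order as the nested loops.
ordered = list(it.permutations(range(n), 2))

# Every (i, j) with i < j, i.e. each unordered pair once.
unordered = list(it.combinations(range(n), 2))
```

Both calls produce tuples rather than the two-element lists built in the question, and the ordered variant yields twice as many pairs.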
Using numpy is probably faster:
import numpy as np
pairs = np.transpose(np.triu_indices(n, 1))
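np.triu_indices(n, 1) returns only the pairs with i < j. If both orders are needed, as in the question's nested loops, one possible sketch masks out the diagonal of a full index grid. The function name `ordered_pairs` and the default int32 dtype are assumptions made for this example.

```python
import numpy as np

def ordered_pairs(n, dtype=np.int32):
    """Return an array of shape (n * (n - 1), 2) with all (i, j), i != j."""
    # i[r, c] == r and j[r, c] == c for every cell of an n x n grid.
    i, j = np.indices((n, n), dtype=dtype)
    mask = i != j  # drop the diagonal, where both numbers are equal
    return np.column_stack((i[mask], j[mask]))

pairs_small = ordered_pairs(3)
```

This builds temporary n x n arrays, so it is only practical for moderate n.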
However, the number of pairs you want to generate is enormous, and you cannot hold them all in memory unless you have a very powerful machine. In particular, you get n * (n - 1) / 2 pairs, which is about 8.6 billion for n = 131072. If you store each number as an 8-byte integer, that is roughly 137 GB of memory; even with 4-byte integers it is just under 70 GB.
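The memory arithmetic can be checked with a few lines of Python. This is a rough sketch that counts only the raw integer storage (two numbers per pair) and ignores any container overhead; the function name `pair_memory_bytes` is hypothetical.

```python
def pair_memory_bytes(n, bytes_per_number=8, ordered=False):
    """Estimate raw storage for all pairs drawn from n numbers."""
    count = n * (n - 1) // 2      # unordered pairs, i < j
    if ordered:
        count *= 2                # both (i, j) and (j, i)
    return count * 2 * bytes_per_number  # two numbers per pair

n = 131072
print(n * (n - 1) // 2)                                # number of unordered pairs
print(pair_memory_bytes(n) / 1e9)                      # GB with 8-byte integers
print(pair_memory_bytes(n, bytes_per_number=4) / 1e9)  # GB with 4-byte integers
```

For n = 131072 this gives 8,589,869,056 unordered pairs, about 137 GB at 8 bytes per number and about 69 GB at 4 bytes per number. A plain Python list of tuples would need considerably more because of per-object overhead.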
For n = 5000:
Note: Because there is more built-in code available for that case, I have generated distinct pairs.