How to implement an efficient infinite generator of prime numbers in Python?

Question

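This is not homework, I am just curious.

INFINITE is the key word here.

I wish to use it as for p in primes(). I believe that this is a built-in function in Haskell.

So, the answer cannot be as naive as "just do a sieve".

First of all, you do not know how many consecutive primes will be consumed. Well, suppose you could concoct 100 of them at a time. Would you use the same sieve approach together with the formula for the frequency of prime numbers?

I prefer a non-concurrent approach.

Thank you for reading (and writing ;) )!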

Answer 1

"If I have seen further…"

The erat2 function from the cookbook can be sped up further (by about 20-25%):

erat2a

import itertools as it
def erat2a( ):
    D = {  }
    yield 2
    for q in it.islice(it.count(3), 0, None, 2):
        p = D.pop(q, None)
        if p is None:
            D[q*q] = q
            yield q
        else:
            # old code here:
            # x = p + q
            # while x in D or not (x&1):
            #     x += p
            # changed into:
            x = q + 2*p
            while x in D:
                x += 2*p
            D[x] = p

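The not (x&1) check verifies that x is odd. However, since both q and p are odd, stepping by 2*p skips half of the steps and makes the oddness test unnecessary.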
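As a small illustration of that argument (the values of p and q below are picked only for the example), stepping by p and discarding even results visits the same numbers as stepping by 2*p directly:

```python
p, q = 3, 9   # q is an odd composite, p its odd prime factor (illustrative values)

# Old stepping: add p each time and filter out the even results.
old_steps = [x for x in (q + k * p for k in range(1, 7)) if x & 1]

# New stepping: add 2*p each time; every result is odd by construction.
new_steps = [q + 2 * k * p for k in range(1, 4)]

print(old_steps, new_steps)  # [15, 21, 27] [15, 21, 27]
```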

erat3

If one doesn't mind a little extra fanciness, erat2 can be sped up by 35-40% with the following changes (NB: needs Python 2.7+ or Python 3+ because of the itertools.compress function):

import itertools as it
def erat3( ):
    D = { 9: 3, 25: 5 }
    yield 2
    yield 3
    yield 5
    MASK= 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0,
    MODULOS= frozenset( (1, 7, 11, 13, 17, 19, 23, 29) )

    for q in it.compress(
            it.islice(it.count(7), 0, None, 2),
            it.cycle(MASK)):
        p = D.pop(q, None)
        if p is None:
            D[q*q] = q
            yield q
        else:
            x = q + 2*p
            while x in D or (x%30) not in MODULOS:
                x += 2*p
            D[x] = p

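The erat3 function takes advantage of the fact that all primes (except for 2, 3 and 5) leave only eight possible remainders modulo 30: the ones included in the MODULOS frozenset. Thus, after yielding the initial three primes, we start from 7 and work only with the candidates.
The candidate filtering uses the itertools.compress function; the "magic" is in the MASK sequence. MASK has 15 elements (there are 15 odd numbers in every 30 numbers, as selected by the itertools.islice function), with a 1 for every possible candidate, starting from 7. The cycle repeats as specified by the itertools.cycle function.
Introducing the candidate filtering requires one more modification: the or (x%30) not in MODULOS check. The erat2 algorithm processed all odd numbers; now that the erat3 algorithm processes only r30 candidates, we need to make sure that all D.keys() can only be such "false" candidates.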
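For readers who want to check where those constants come from, here is a minimal sketch that rebuilds them from the wheel size. The names WHEEL, modulos and mask are introduced here for illustration and are not part of the answer's code.

```python
from math import gcd

WHEEL = 30  # 2 * 3 * 5

# Residues modulo 30 that share no factor with 30: the only ones a prime > 5 can have.
modulos = frozenset(r for r in range(WHEEL) if gcd(r, WHEEL) == 1)

# One flag per odd number 7, 9, 11, ..., 35 (15 odd numbers per block of 30).
mask = tuple(int((7 + 2 * k) % WHEEL in modulos) for k in range(WHEEL // 2))

print(sorted(modulos))
print(mask)
```

The computed values should match the MODULOS and MASK literals used in erat3.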

Benchmarks

Results

On an Atom 330 Ubuntu 9.10 server, versions 2.6.4 and 3.1.1+:

$ testit
up to 8192
==== python2 erat2 ====
100 loops, best of 3: 18.6 msec per loop
==== python2 erat2a ====
100 loops, best of 3: 14.5 msec per loop
==== python2 erat3 ====
Traceback (most recent call last):
…
AttributeError: 'module' object has no attribute 'compress'
==== python3 erat2 ====
100 loops, best of 3: 19.2 msec per loop
==== python3 erat2a ====
100 loops, best of 3: 14.1 msec per loop
==== python3 erat3 ====
100 loops, best of 3: 11.7 msec per loop

On an AMD Geode LX Gentoo home server, Python 2.6.5 and 3.1.2:

$ testit
up to 8192
==== python2 erat2 ====
10 loops, best of 3: 104 msec per loop
==== python2 erat2a ====
10 loops, best of 3: 81 msec per loop
==== python2 erat3 ====
Traceback (most recent call last):
…
AttributeError: 'module' object has no attribute 'compress'
==== python3 erat2 ====
10 loops, best of 3: 116 msec per loop
==== python3 erat2a ====
10 loops, best of 3: 82 msec per loop
==== python3 erat3 ====
10 loops, best of 3: 66 msec per loop

Benchmark code

A primegen.py module contains the erat2, erat2a and erat3 functions. Here follows the testing script:

#!/bin/sh
max_num=${1:-8192}
echo up to $max_num
for python_version in python2 python3
do
    for function in erat2 erat2a erat3
    do
        echo "==== $python_version $function ===="
        $python_version -O -m timeit -c \
        -s  "import itertools as it, functools as ft, operator as op, primegen; cmp= ft.partial(op.ge, $max_num)" \
            "next(it.dropwhile(cmp, primegen.$function()))"
    done
done

Answer 2

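Since the OP asks for an efficient implementation, here's a significant improvement to the ActiveState 2002 code by David Eppstein/Alex Martelli (see his answer): don't record a prime's info in the dictionary until its square is seen among the candidates. This brings the space complexity down to below O(sqrt(n)) instead of O(n), for n primes produced ( π(sqrt(n log n)) ~ 2 sqrt(n log n) / log(n log n) ~ 2 sqrt(n / log n) ). Consequently, the time complexity is also improved, i.e. it runs faster.

It creates a "sliding sieve" as a dictionary of the current multiples of each base prime (i.e. each prime below the sqrt of the current production point), together with their step values: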

from itertools import count
                                         # ideone.com/aVndFM
def postponed_sieve():                   # postponed sieve, by Will Ness      
    yield 2; yield 3; yield 5; yield 7;  # original code David Eppstein, 
    sieve = {}                           #   Alex Martelli, ActiveState Recipe 2002
    ps = postponed_sieve()               # a separate base Primes Supply:
    p = next(ps) and next(ps)            # (3) a Prime to add to dict
    q = p*p                              # (9) its sQuare 
    for c in count(9,2):                 # the Candidate
        if c in sieve:               # c's a multiple of some base prime
            s = sieve.pop(c)         #     i.e. a composite ; or
        elif c < q:  
             yield c                 # a prime
             continue              
        else:   # (c==q):            # or the next base prime's square:
            s=count(q+2*p,2*p)       #    (9+6, by 6 : 15,21,27,33,...)
            p=next(ps)               #    (5)
            q=p*p                    #    (25)
        for m in s:                  # the next multiple 
            if m not in sieve:       # no duplicates
                break
        sieve[m] = s                 # original test entry: ideone.com/WFv4f

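(The older, original code here was edited to incorporate changes as seen in the answer by Tim Peters, below.) See also the related discussion.

Similar 2-3-5-7 wheel-based code runs about 2.15x faster (which is very close to the theoretical improvement of 3/2 * 5/4 * 7/6 = 2.1875).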
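The theoretical figure can be reproduced with exact arithmetic. This is only a sketch of the calculation quoted above, assuming the odds-only sieve as the baseline, so each further wheel prime p contributes a factor p/(p-1); it needs Python 3.8+ for math.prod.

```python
from fractions import Fraction
from math import prod

wheel_primes = (3, 5, 7)  # 2 is already skipped by the odds-only baseline

# Each wheel prime p removes 1/p of the remaining candidates: factor p / (p - 1).
speedup = prod(Fraction(p, p - 1) for p in wheel_primes)

print(speedup, float(speedup))  # 35/16 2.1875
```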

Answer 3

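For posterity, here's a rewrite of Will Ness's beautiful algorithm for Python 3. Some changes are needed (iterators no longer have .next() methods, but there's a new next() builtin function). Other changes are for fun (using the new yield from <iterable> replaces four yield statements in the original). More are for readability (I'm not a fan of overusing ;-) 1-letter variable names).

It's significantly faster than the original, but not for algorithmic reasons. The speedup is mostly due to removing the original's add() function and doing that work inline instead.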

def psieve():
    import itertools
    yield from (2, 3, 5, 7)
    D = {}
    ps = psieve()
    next(ps)
    p = next(ps)
    assert p == 3
    psq = p*p
    for i in itertools.count(9, 2):
        if i in D:      # composite
            step = D.pop(i)
        elif i < psq:   # prime
            yield i
            continue
        else:           # composite, = p*p
            assert i == psq
            step = 2*p
            p = next(ps)
            psq = p*p
        i += step
        while i in D:
            i += step
        D[i] = step
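To get a feel for how small the dictionary stays with the postponed approach, here is a sketch that instruments a copy of the generator. The function psieve_with_size is a hypothetical helper written for this illustration; it yields each prime together with the number of dictionary entries held at that moment.

```python
import itertools

def psieve_with_size():
    """Copy of the postponed sieve that yields (prime, len(D)) pairs.

    Hypothetical instrumentation for illustration, not part of the answer.
    """
    for small in (2, 3, 5, 7):
        yield small, 0
    D = {}
    ps = psieve_with_size()      # separate supply of base primes
    next(ps)                     # skip 2
    p = next(ps)[0]              # 3
    psq = p * p
    for i in itertools.count(9, 2):
        if i in D:               # composite with a recorded step
            step = D.pop(i)
        elif i < psq:            # prime
            yield i, len(D)
            continue
        else:                    # i == psq: start tracking multiples of p
            step = 2 * p
            p = next(ps)[0]
            psq = p * p
        i += step
        while i in D:
            i += step
        D[i] = step

# The 1000th prime and the dictionary size at the moment it is produced.
prime, size = next(itertools.islice(psieve_with_size(), 999, None))
print(prime, size)
```

The dictionary holds one entry per odd base prime whose square has already been passed, so its size grows with the square root of the current prime rather than with the number of primes produced.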

Answer 4

This isn't originally my code; however, it's worth posting. The original can be found here: http://code.activestate.com/recipes/117119/

def gen_primes():
  D = {}
  q = 2  # first integer to test for primality.

  while True:
    if q not in D:
      # not marked composite, must be prime  
      yield q 

      #first multiple of q not already marked
      D[q * q] = [q] 
    else:
      for p in D[q]:
        D.setdefault(p + q, []).append(p)
      # no longer need D[q], free memory
      del D[q]

    q += 1

It's a generator, so use it like any other.

primes = gen_primes()
for p in primes:
  print(p)

Generating 1 million primes and putting them into a set takes 1.62 s on my desktop.

Answer 5

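Do a segmented sieve, where the size of a segment is determined by the available memory or the maximal size of a bitset.

For each segment, represent the numbers in some interval [n; n + segment_size) as a bit set and sieve with all prime numbers below the square root of the upper bound.

Using a bit set uses less memory than a hash table or tree data structure, because you are working with dense sets of numbers.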
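A minimal sketch of this idea follows. It is not code from the answer: it assumes a bytearray (one byte per number) as a stand-in for a true bit set, recomputes the base primes for every segment with a small classic sieve, and needs Python 3.8+ for math.isqrt. The names simple_sieve and segmented_primes are made up for the example.

```python
from itertools import islice
from math import isqrt

def simple_sieve(limit):
    """Return all primes <= limit with the classic sieve (limit >= 1)."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"                     # 0 and 1 are not prime
    for i in range(2, isqrt(limit) + 1):
        if flags[i]:
            flags[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, f in enumerate(flags) if f]

def segmented_primes(segment_size=1 << 16):
    """Yield primes indefinitely, sieving one interval [lo, hi) at a time."""
    lo = 2
    while True:
        hi = lo + segment_size
        flags = bytearray([1]) * segment_size    # flags[k] stands for lo + k
        for p in simple_sieve(isqrt(hi - 1)):    # base primes up to sqrt of the bound
            # First multiple of p inside the segment, but never below p*p.
            start = max(p * p, -(-lo // p) * p)
            if start < hi:
                flags[start - lo::p] = bytearray(len(range(start, hi, p)))
        for k, f in enumerate(flags):
            if f:
                yield lo + k
        lo = hi

print(list(islice(segmented_primes(), 10)))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Memory use is bounded by segment_size plus the list of base primes; a production version would cache the base primes between segments instead of recomputing them.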

Answer 6

This is a pretty fast infinite generator, written in Python2 but easily adjusted to Python3. To use it to add up the primes up to 10**9, use the following:

from itertools import takewhile
from functools import partial
from operator import gt
print (sum(takewhile(partial(gt, 10**9), prime_gen_inf())))

It is a segmented sieve, faster but obviously less elegant than Will Ness's algorithm.

from operator import mul
from functools import reduce
def prod(x): return reduce(mul, x, 1)


def build_sieve(wheel):
    w = prod(wheel)
    w_phi = prod([p-1 for p in wheel])
    rems = [a for a in range(w) if all(a % p for p in wheel)]
    assert len(rems) == w_phi
    inv = {a:pow(a, w_phi - 1, w) for a in rems}
    try:
        known_p = wheel + rems[1 : rems.index(rems[1]*rems[1])]
    except ValueError:
        known_p = wheel + rems[1:]
    return wheel, w, w_phi, rems, inv, known_p

#Adjust the chunk variable based on your computer's architecture.
#
#Adjust the line with #! if you don't need "true" infinite.  If you don't need
#primes larger than 1<<32, use array('H', []), if 1<<64 use 'L', if 1<<128 (in
#Python3) use 'Q', otherwise use empty list [].
#To save memory, comment out the lines with #*, and uncomment the commented-out
#lines 
import itertools
from itertools import islice, count, compress, izip
chain_f = itertools.chain.from_iterable
from array import array
def prime_gen_inf(chunk=250000, sieve_info = build_sieve([2,3,5,7])):
    """    Indefinitely yields primes    """
    wheel, w, w_phi, rems, inv, known_p = sieve_info
    for p in known_p: yield p
    new_n = 0;
    while True:
        size = min(chunk, (p * p - new_n) / w)
        sieve = bytearray([1]) * size * w_phi
        n, new_n = new_n, new_n + size * w
        if not n:
            zero = bytearray([0])
            seen = len(known_p) - len(wheel) + 1
            sieve[:seen:1] = zero * seen
            p_gen = islice(prime_gen_inf(), len(wheel), None)
            new_p = next(p_gen)
            ps = []                                         #! array('H', [])
            p_invs = bytearray([])                                         #*
        while new_p * new_p < new_n:
            ps.append(new_p)
            p_invs.append(inv[new_p % w])                                  #*
            new_p = next(p_gen)
        for p, p_inv, modp in izip(ps, p_invs, [-n % p for p in ps]):      #*
            s = [(modp + p * (p_inv * (r - modp) % w)) / w for r in rems]  #*
        #for p in ps:
        #    s = [(-n%p + p * (inv[p%w] * (r - -n%p) % w)) / w for r in rems]
            for i, start in enumerate(s):
                slice_size = ((size - start - 1) / p + 1)
                sieve[i + start * w_phi :: p * w_phi] = zero * slice_size
        for p in compress(chain_f(izip(*[count(n+r, w) for r in rems])), sieve):
            yield p

Answer 7

Another answer, more memory-efficient than my erat3 answer:

import heapq

def heapprimegen():
    hp= []
    yield 2
    yield 3
    cn= 3
    nn, inc= 3, 6
    while 1:
        while cn < nn:
            yield cn
            heapq.heappush(hp, (3*cn, 2*cn))
            cn+= 2
        cn= nn+2
        nn, inc= heapq.heappushpop(hp, (nn+inc, inc))

It maintains a heap (a list) of prime multiples rather than a dictionary. It loses some speed, obviously.

Answer 8

Another way to do it:

import itertools
def primeseq():
    prime = [2]
    num = 0
    yield 2
    for i in itertools.count(3, 2):
        is_prime = True
        for num in prime:
            if i % num == 0:
                is_prime = False
                break
            elif num ** 2 > i: 
                break
        if is_prime:
            prime.append(i)
            yield i

Answer 9

Here is a simple but not terribly slow one using a heap instead of a dict:

import heapq

def heap_prime_gen_squares(): 
    yield 2  
    yield 3  
    h = [(9, 6)]
    n = 5
    while True:
        a, b = h[0]
        while n < a:
            yield n
            heapq.heappush(h, (n * n, n << 1))
            n += 2
        heapq.heapreplace(h, (a + b, b))  # Replace h[0], which is still (a, b).

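My speed measurements of user time for the first 1 million primes (smaller numbers are better):

postponed_sieve (dict-based): 8.553s
erat2b (dict-based): 9.513s
erat2a (dict-based): 10.313s
heap_prime_gen_smallmem (heap-based): 23.935s
heap_prime_gen_squares (heap-based): 27.302s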
heapprimegen (heap-based): 145.029s

So dict-based approaches seem to be the fastest.

Answer 10

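Here is a complicated heap-based implementation, which is not much faster than other heap-based implementations (see the speed comparison in my other answer), but it uses much less memory.

This implementation uses two heaps (tu and vw), which contain the same number of elements. Each element is an int pair. In order to find all primes up to q**2 (where q is a prime), each heap will contain at most 2*pi(q-1) elements, where pi(x) is the number of positive primes not larger than x. So the total number of integers is at most 4*pi(floor(sqrt(n))). (We could gain a factor of 2 on memory by pushing half as much stuff to the heap, but that would make the algorithm slower.)

The other dict-based and heap-based approaches above (e.g. erat2b, heap_prime_gen_squares and heapprimegen) store about 2*pi(n) integers, because they extend their heap or dict every time they find a prime. As a comparison: to find the first 1_000_000 primes, this implementation stores fewer than 4141 integers, while the other implementations store more than 1_000_000 integers.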

import heapq

def heap_prime_gen_smallmem():
    yield 2
    yield 3
    f = 5
    fmar3 = 2
    q = 7
    q6 = 7 * 6
    qmar3 = 4
    tu = [(25, 30), (35, 30)]
    vw = [(25, 30), (35, 30)]
    while True:
        qmar3 += 2   
        if qmar3 == 6:  
            qb = q + 4
            q6b = q6 + 24
            qmar3 = 2
        else:
            qb = q + 2
            q6b = q6 + 12
        if q < tu[0][0]:
            d = q * q
            while f < d:
                a, b = vw[0]
                if f < a: 
                    yield f   
                else:
                    a, b = vw[0]
                    heapq.heapreplace(vw, (a + b, b))
                    a, b = vw[0]
                    while f >= a:
                        heapq.heapreplace(vw, (a + b, b))
                        a, b = vw[0]   
                fmar3 += 2
                if fmar3 == 6:
                    f += 4
                    fmar3 = 2
                else:
                    f += 2
            c = q * qb   
            heapq.heappush(tu, (d, q6))
            heapq.heappush(tu, (c, q6))
            heapq.heappush(vw, (d, q6))
            heapq.heappush(vw, (c, q6))
        else:
            a, b = tu[0]
            heapq.heapreplace(tu, (a + b, b))
            a, b = tu[0]  
            while q >= a:
                heapq.heapreplace(tu, (a + b, b))
                a, b = tu[0]
        q = qb
        q6 = q6b

Answer 11

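Here's a generator that's a little truer to how it's done in Haskell: filtering against composites of known primes, then adding the remaining primes to the list.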

def gen_primes():
    primes = []
    i = 2
    while True:
        prime = True
        for p in primes:
            if not (i % p):
                prime = False
                break
        if prime:
            yield i
            primes.append(i)
        i += 1

Answer 12

Some time ago I wrote an article about an infinite prime number generator:

http://stacktrace.it/2008/01/progetto-eulero-problema-3/

It's in Italian, but you may get a crummy translation using Google: http://tinyurl.com/yzpyeom

Answer 13

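I know the post is old, but I came across this question myself... The following code is based on a very simple idea: a growing sieve of Eratosthenes. This solution is really slower than the best ones here, but it is easy to grasp and designed to be readable...

I used integers to store the results of the sieve. In binary format, an integer is a list of 0s and 1s: 0 at position i if i is not a prime, 1 if it may be a prime. The required infinity is a result of the fact that Python 3 integers are unbounded.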

def primes():
    container, size = 1 << 2, 3 # we start with 0b100 (from right to left: 0 and 1 are not primes, 2 is prime)
    last_prime = 1
    while True:
        prime = next((j for j in range(last_prime+1, size) if container & 1 << j), None) # find the next prime
        while not prime:
            container, size = expand(container, size, 2**16) # add 65536 cells and sieve the container
            prime = next((j for j in range(last_prime+1, size) if container & 1 << j), None)
        yield prime
        last_prime = prime # resume the search after the prime just yielded

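How to expand the container? Just add a bunch of 1s at the left of the container (in binary format) and sieve them. This is identical to the standard sieve, with a slight difference. In the standard sieve, if we find a prime i, we start to cross out the cells at i*i, with a step of i.

Here, this may already have been done for the first part of the container. If the new part of the container begins beyond i*i, we only need to start at the beginning of that new part.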

def expand(container, size, n):
    new_size = size + n
    container += (1 << (new_size + 1) - 1) - (1 << size) # add n 1's
    for i in range(2, new_size):
        if container & (1 << i): # i is a prime
            t = sum(1 << j for j in range(max(i, size // i)*i, new_size, i)) # set 1 for all mutiple
            container &= ~t # cross the cells

    return container, new_size
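The bit arithmetic can be checked by hand on a tiny case. The sketch below is separate from the answer's code; ones_block and multiples_mask are hypothetical helpers that mirror the two steps of expand (append n ones, then clear the multiples of a prime).

```python
def ones_block(size, n):
    """Integer with bits size .. size+n-1 set: n fresh 'maybe prime' cells."""
    return (1 << (size + n)) - (1 << size)

def multiples_mask(i, start, stop):
    """Integer with a 1 bit at every multiple of i in [start, stop)."""
    first = -(-start // i) * i          # smallest multiple of i that is >= start
    return sum(1 << j for j in range(first, stop, i))

container, size = 0b100, 3              # cells 0..2, only 2 is marked prime
container |= ones_block(size, 5)        # add cells 3..7, all set to 1
container &= ~multiples_mask(2, 4, 8)   # cross out 4 and 6

print(bin(container))                   # 0b10101100
print([j for j in range(8) if container >> j & 1])  # [2, 3, 5, 7]
```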

Test for the primes below one million:

import itertools
assert 78498 == len(list(itertools.takewhile(lambda p: p<1000000, primes())))

Answer 1 (78, 2010-09-26 03:01:48)
Answer 2 (73, 2012-05-24 08:16:32)
Answer 3 (45, 2013-10-15 21:08:31)
Answer 4 (8, 2010-02-06 04:09:18)
Answer 5 (5, 2010-02-06 10:47:48)
Answer 6 (2, 2015-11-05 21:42:36)
Answer 7 (2, 2011-12-05 13:44:11)
Answer 8 (2, 2012-02-05 18:42:03)
Answer 9 (1, 2012-08-01 12:52:24)
Answer 10 (1, 2012-08-01 13:03:40)
Answer 11 (0, 2010-02-06 04:22:27)
Answer 12 (0, 2010-02-06 12:43:18)
Answer 13 (0, 2018-05-17 19:33:26)