Prime Sieve in Haskell

I'm very new to Haskell and I'm just trying to find the sum of the first 2 million primes. I'm trying to generate the primes using a sieve (I think the sieve of Eratosthenes?), but it's really, really slow and I don't know why. Here is my code.

sieve (x:xs) = x:(sieve $ filter (\a -> a `mod` x /= 0) xs)
ans = sum $ takeWhile (<2000000) (sieve [2..])

Thanks in advance.

It is very slow because the algorithm is a trial division that doesn't stop at the square root.

If you look closely at what the algorithm does, you see that for each prime p, its multiples that have no smaller prime divisors are removed from the list of candidates (the multiples with smaller prime divisors were removed previously).

So each number is divided by all primes until either it is removed as a multiple of its smallest prime divisor or, if it is a prime, it appears at the head of the list of remaining candidates.

For the composite numbers, that isn't particularly bad, since most composite numbers have small prime divisors, and in the worst case, the smallest prime divisor of n doesn't exceed √n.

But the primes are divided by all smaller primes, so by the time the k-th prime is found to be prime, it has been divided by all k-1 smaller primes. If there are m primes below the limit n, the work needed to establish that all of them are prime is

(1-1) + (2-1) + (3-1) + ... + (m-1) = m*(m-1)/2

divisions. By the prime number theorem, the number of primes below n is asymptotically n / log n (where log denotes the natural logarithm). The work to eliminate the composites can crudely be bounded by n * √n divisions, so for not too small n that is negligible in comparison to the work spent on the primes.

For the primes to two million, the Turner sieve needs roughly 10^10 divisions. Additionally, it needs to deconstruct and reconstruct a lot of list cells.
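As a rough check of that figure, here is a minimal sketch that evaluates m*(m-1)/2 for a given limit. The names primeCount and turnerPrimeDivisions are made up for this illustration, and the prime count is obtained by naive trial division, so it is only meant for small limits.

```haskell
-- Number of primes below the limit, found by plain trial division
-- up to the square root (adequate for small limits only).
primeCount :: Int -> Int
primeCount limit = length [n | n <- [2 .. limit - 1], isPrime n]
  where
    isPrime n = all (\d -> n `rem` d /= 0)
                    (takeWhile (\d -> d * d <= n) [2 ..])

-- Divisions the Turner sieve spends on the primes alone:
-- the k-th prime is divided by the k-1 smaller primes.
turnerPrimeDivisions :: Int -> Integer
turnerPrimeDivisions limit = m * (m - 1) `div` 2
  where
    m = fromIntegral (primeCount limit)
```

For a limit of 100 there are 25 primes, so the formula gives 300 divisions. Taking m = 148933 (the number of primes below two million) gives about 1.1*10^10.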

A trial division that stops at the square root,

isPrime n = go 2
  where
    go d
      | d*d > n        = True
      | n `rem` d == 0 = False
      | otherwise      = go (d+1)

primes = filter isPrime [2 .. ]

would need fewer than 1.9*10^9 divisions (a brutal estimate, as if every isPrime n check went to √n; actually, it takes only 179492732 because composites are generally cheap) (1) and far fewer list operations. Additionally, this trial division is easily improved by skipping even numbers (except 2) as candidate divisors, which halves the number of required divisions.
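The improvement mentioned last could look roughly like the following sketch. The names isPrimeOdd and primesOdd are chosen here for illustration and are not part of the code above; the type is fixed as Int as an assumption.

```haskell
-- Trial division that handles 2 separately and then tries
-- only odd candidate divisors up to the square root.
isPrimeOdd :: Int -> Bool
isPrimeOdd n
  | n < 2     = False
  | n < 4     = True        -- 2 and 3
  | even n    = False       -- one division settles every even number
  | otherwise = go 3
  where
    go d
      | d * d > n      = True
      | n `rem` d == 0 = False
      | otherwise      = go (d + 2)   -- skip even divisors

primesOdd :: [Int]
primesOdd = filter isPrimeOdd [2 ..]
```

It is used the same way as before, for example sum (takeWhile (< 2000000) primesOdd).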

A sieve of Eratosthenes doesn't need any divisions and uses only O(n * log (log n)) operations, which is quite a bit faster:

primeSum.hs:

module Main (main) where

import System.Environment (getArgs)
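-- From the arithmoi package. Newer arithmoi versions wrap the elements in a
-- Prime type, so there you may need `map unPrime primes` below.
import Math.NumberTheory.Primes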

main :: IO ()
main = do
    args <- getArgs
    let lim = case args of
                (a:_) -> read a
                _     -> 1000000
    print . sum $ takeWhile (<= lim) primes

And running it for a limit of 10 million:

$ ghc -O2 primeSum && time ./primeSum 10000000
[1 of 1] Compiling Main             ( primeSum.hs, primeSum.o )
Linking primeSum ...
3203324994356

real    0m0.085s
user    0m0.084s
sys     0m0.000s

We let the trial division run only to 1 million (fixing the type as Int):

$ ghc -O2 tdprimeSum && time ./tdprimeSum 1000000
[1 of 1] Compiling Main             ( tdprimeSum.hs, tdprimeSum.o )
Linking tdprimeSum ...
37550402023

real    0m0.768s
user    0m0.765s
sys     0m0.002s

And the Turner sieve only to 100000:

$ ghc -O2 tuprimeSum && time ./tuprimeSum 100000
[1 of 1] Compiling Main             ( tuprimeSum.hs, tuprimeSum.o )
Linking tuprimeSum ...
454396537

real    0m2.712s
user    0m2.703s
sys     0m0.005s

(1) The brutal estimate is

2000000
   ∑ √k ≈ 4/3*√2*10^9
 k = 1

evaluated to two significant digits. Since most numbers are composites with a small prime factor (half the numbers are even and take only one division), that vastly overestimates the number of divisions required.

A lower bound for the number of required divisions would be obtained by considering primes only:

   ∑ √p ≈ 2/3*N^1.5/log N
 p < N
p prime

which, for N = 2000000, gives roughly 1.3*10^8. That is the right order of magnitude, but it underestimates by a nontrivial factor (decreasing slowly to 1 for growing N, and never greater than 2 for N > 10).

Besides the primes, the squares of primes and the products of two close primes also require the trial division to go up to (nearly) √k and hence contribute significantly to the overall work if there are sufficiently many of them.

The number of divisions needed to treat the semiprimes is, however, bounded by a constant multiple of

N^1.5/(log N)^2

so for very large N it becomes negligible relative to the cost of treating primes. But in the range where trial division is feasible at all, they still contribute significantly.

Here's a sieve of Eratosthenes:

P = {3, 5, ...} \ ⋃ {{p^2, p^2+2p, ...} | p in P}

(without the 2). :) Or in "functional", i.e. list-based, Haskell,

primes = 2 : g (fix g)  where
   g xs = 3 : (gaps 5 $ unionAll [[p*p, p*p+2*p..] | p <- xs])

unionAll ((x:xs):t) = x : union xs (unionAll $ pairs t)  where
  pairs ((x:xs):ys:t) = (x : union xs ys) : pairs t 
fix g = xs where xs = g xs
union (x:xs) (y:ys) = case compare x y of LT -> x : union  xs (y:ys)
                                          EQ -> x : union  xs    ys 
                                          GT -> y : union (x:xs) ys
gaps k s@(x:xs) | k<x  = k:gaps (k+2) s    
                | True =   gaps (k+2) xs 

Compared with the trial division code in the answer by augustss, it is 1.9 times faster at generating 200k primes, and 2.1 times faster at 400k, with empirical time complexity of O(n^1.12..1.15) vs O(n^1.4), on said range. It is 2.6 times faster at generating 1 mln primes.

Why the Turner sieve is so slow

Because it opens up multiples-filtering streams for each prime too early, and so ends up with too many of them. We don't need to filter by a prime until its square is seen in the input.

Seen under a stream processing paradigm, sieve (x:xs) = x:sieve [y|y<-xs, rem y x/=0] can be seen as creating a pipeline of stream transducers behind itself as it is working:

[2..] ==> sieve --> 2
[3..] ==> nomult 2 ==> sieve --> 3
[4..] ==> nomult 2 ==> nomult 3 ==> sieve 
[5..] ==> nomult 2 ==> nomult 3 ==> sieve --> 5
[6..] ==> nomult 2 ==> nomult 3 ==> nomult 5 ==> sieve 
[7..] ==> nomult 2 ==> nomult 3 ==> nomult 5 ==> sieve --> 7
[8..] ==> nomult 2 ==> nomult 3 ==> nomult 5 ==> nomult 7 ==> sieve 

where nomult p = filter (\y->rem y p/=0). But 8 doesn't need to be checked for divisibility by 3 yet, as it is smaller than 3^2 == 9, let alone by 5 or 7.

This is the single most serious problem with that code, although it is dismissed as irrelevant right at the start of that article which everybody mentions. Fixing it by postponing the creation of filters achieves dramatic speedups.
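One way to postpone the filters is sketched below: the filter for a prime p is only attached once the input has reached p*p. The name primesPT is picked for this illustration and the Int type is an assumption. It is still trial division, just with far fewer filters active at any time.

```haskell
-- Turner-style sieve with postponed filters: everything below p*p
-- is passed through untouched, and only the rest is filtered by p.
primesPT :: [Int]
primesPT = 2 : sieve primesPT [3 ..]
  where
    -- The list of primes is infinite, so the (p:ps) pattern always matches.
    sieve (p:ps) xs =
      let (h, t) = span (< p * p) xs   -- h: numbers already known to be prime
      in  h ++ sieve ps [y | y <- t, y `rem` p /= 0]
```

Here the number 8 passes only through the filter for 2, because the filter for 3 is not created until 9 is reached.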

What you did is not the Sieve of Eratosthenes; it's trial division (note the mod operator). Here's my version of the Sieve of Eratosthenes:

import Control.Monad (forM_, when)
import Control.Monad.ST
import Data.Array.ST
import Data.Array.Unboxed

sieve :: Int -> UArray Int Bool
sieve n = runSTUArray $ do
    let m = (n-1) `div` 2
        r = floor . sqrt $ fromIntegral n
    bits <- newArray (0, m-1) True
    forM_ [0 .. r `div` 2 - 1] $ \i -> do
        isPrime <- readArray bits i
        when isPrime $ do
            forM_ [2*i*i+6*i+3, 2*i*i+8*i+6 .. (m-1)] $ \j -> do
                writeArray bits j False
    return bits

primes :: Int -> [Int]
primes n = 2 : [2*i+3 | (i, True) <- assocs $ sieve n]

You can run it at http://ideone.com/mu1RN .

Personally, I like this way of generating primes:

primes :: [Integer]
primes = 2 : filter (isPrime primes) [3,5..]
  where isPrime (p:ps) n = p*p > n || n `rem` p /= 0 && isPrime ps n

It's also quite fast compared to some of the other methods suggested here. It's still trial division, but it only tests with primes. (The termination proof for this code is slightly tricky, though.)

The algorithm you're using is not a sieve at all, so you should expect it to be slow, since it uses trial division.

Primes occur with roughly the frequency given by the logarithm function, i.e. there are ballpark n/log(n) primes between 1 and n. So for the first 2 million primes you are going to need to go up to about 32 million. But you are building a 2 million element data structure that those primes are going to have to pass through. So you can start to see why this was so slow. In fact it is O(n^2). You can cut it down to O(n*(log n)*log(log n)).
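To see where a figure like 32 million comes from, one can estimate how far a sieve has to run to contain the first k primes. The sketch below uses the known bound p_k < k*(log k + log (log k)), valid for k >= 6; the function name nthPrimeUpperBound is hypothetical and not taken from any library.

```haskell
-- An upper limit that is guaranteed to contain the first k primes,
-- using p_k < k * (log k + log (log k)) for k >= 6.
nthPrimeUpperBound :: Int -> Int
nthPrimeUpperBound k
  | k < 6     = 11   -- the fifth prime, covers all smaller k
  | otherwise = ceiling (x * (log x + log (log x)))
  where
    x = fromIntegral k :: Double
```

For k = 2000000 this gives a limit of a bit over 34 million, comfortably above the "about 32 million" quoted above.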

Here is a page on various treatments that walks you through how to cut that down a bit: http://en.literateprograms.org/Sieve_of_Eratosthenes_(Haskell) (dead link as of 2022).

A simple way to implement the genuine sieve of Eratosthenes in Haskell is to iteratively zipWith over a list of Boolean values signifying whether the corresponding number is prime:

primes :: Int -> [Int]
primes n = [x | (x, prime) <- zip [2..n] $
                  sieve (ceiling . sqrt $ fromIntegral n)
              , prime]
    where 
    sieve 1 = repeat True
    sieve n = zipWith (&&) (filt n) (sieve (n - 1))
    filt n = replicate (n - 1) True
             ++ (tail . cycle $ False :
                  replicate (n - 1) True)

Now, primes n gives the list of the primes up to n.
