Sieve of Eratosthenes in Haskell

Question

I'm solving some classic problems in Haskell to develop my functional programming skills, and I'm having trouble implementing an optimization suggested on the "Programming Praxis" site.

I have three solutions to this problem, and the third one is too slow compared to the second. Can someone suggest some improvements to my code?

My implementations are:

import Data.List (find)

-- first implementation
primes n
    | n < 2 = []
    | n == 2 = [2]
    | n `mod` 2 == 0 = primes'
    | otherwise = if (find (\x -> n `mod` x == 0) primes') == Nothing then
                      n:primes'
                  else
                      primes'
    where primes' = primes (n - 1)

-- second implementation
primes' :: Integer -> [Integer]
primes' n = sieve $ 2 : [3,5..n]
    where sieve :: [Integer] -> [Integer]
          sieve [] = []
          sieve l@(x:xs)
              | x*x > n = l   -- strict: with >= a perfect square n would survive
              | otherwise = x : sieve list'
              where list' = filter (\y -> y `mod` x /= 0) xs

-- third implementation
primes'' :: Integer -> [Integer]
primes'' n = 2 : sieve 3 [3,5..n]
    where sieve :: Integer -> [Integer] -> [Integer]
          sieve _ [] = []
          sieve m l@(x:xs)
              | m*m > n = l   -- strict: with >= a perfect square n would survive
              | x < m*m = x : sieve m xs
              | otherwise = sieve (m + 2) list'
              where list'= filter (\y -> y `mod` m /= 0) l

Answer 1

First of all, mod is slower than rem, so use rem wherever the difference doesn't matter (basically, when you aren't dealing with negative numbers). Secondly, use Criterion to show (to yourself) what is faster and which changes are actually optimizations. I know I'm not giving a full answer to your question with this, but it's a good place for you (and other potential answerers) to start, so here's some code:

import Data.List (find)
import Criterion.Main

main = do
  str <- getLine
  let run f = length . f
      input = read str :: Integer
  defaultMain   [ bench "primes" (nf (run primes) input)
                , bench "primes'" (nf (run primes') input)
                , bench "primes''" (nf (run primes'') input)
                , bench "primesTMD" (nf (run primesTMD) input)
                , bench "primes'TMD" (nf (run primes'TMD) input)
                , bench "primes''TMD" (nf (run primes''TMD) input)
                ]
  putStrLn . show . length . primes'' $ (read str :: Integer)

-- first implementation
primes n
    | n < 2 = []
    | n == 2 = [2]
    | n `mod` 2 == 0 = primes'
    | otherwise = if (find (\x -> n `mod` x == 0) primes') == Nothing then
                      n:primes'
                  else
                      primes'
    where primes' = primes (n - 1)

primesTMD n
    | n < 2 = []
    | n == 2 = [2]
    | n `mod` 2 == 0 = primes'
    | otherwise = if (find (\x -> n `rem` x == 0) primes') == Nothing then
                      n:primes'
                  else
                      primes'
    where primes' = primesTMD (n - 1)

-- second implementation
primes' :: Integer -> [Integer]
primes' n = sieve $ 2 : [3,5..n]
    where sieve :: [Integer] -> [Integer]
          sieve [] = []
          sieve l@(x:xs)
              | x*x > n = l   -- strict: with >= a perfect square n would survive
              | otherwise = x : sieve list'
              where list' = filter (\y -> y `mod` x /= 0) xs

primes'TMD :: Integer -> [Integer]
primes'TMD n = sieve $ 2 : [3,5..n]
    where sieve :: [Integer] -> [Integer]
          sieve [] = []
          sieve l@(x:xs)
              | x*x > n = l   -- strict: with >= a perfect square n would survive
              | otherwise = x : sieve list'
              where list' = filter (\y -> y `rem` x /= 0) xs

-- third implementation
primes'' :: Integer -> [Integer]
primes'' n = 2 : sieve 3 [3,5..n]
    where sieve :: Integer -> [Integer] -> [Integer]
          sieve _ [] = []
          sieve m l@(x:xs)
              | m*m > n = l   -- strict: with >= a perfect square n would survive
              | x < m*m = x : sieve m xs
              | otherwise = sieve (m + 2) list'
              where list'= filter (\y -> y `mod` m /= 0) l

primes''TMD :: Integer -> [Integer]
primes''TMD n = 2 : sieve 3 [3,5..n]
    where sieve :: Integer -> [Integer] -> [Integer]
          sieve _ [] = []
          sieve m l@(x:xs)
              | m*m > n = l   -- strict: with >= a perfect square n would survive
              | x < m*m = x : sieve m xs
              | otherwise = sieve (m + 2) list'
              where list'= filter (\y -> y `rem` m /= 0) l

Notice the improved runtime of the variants using rem:

 $ ghc --make -O2 sieve.hs
 $ ./sieve
 5000
 ...
 benchmarking primes 
 mean: 23.88546 ms, lb 23.84035 ms, ub 23.95000 ms

 benchmarking primes'
 mean: 775.9981 us, lb 775.4639 us, ub 776.7081 us

 benchmarking primes''
 mean: 837.7901 us, lb 836.7824 us, ub 839.0260 us

 benchmarking primesTMD
 mean: 16.15421 ms, lb 16.11955 ms, ub 16.19202 ms

 benchmarking primes'TMD
 mean: 568.9857 us, lb 568.5819 us, ub 569.4641 us

 benchmarking primes''TMD
 mean: 642.5665 us, lb 642.0495 us, ub 643.4105 us
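
As a small illustration of why the swap is safe here, the sketch below compares the two operators on mixed signs. The names remVsMod and divisibleBy are made up for this example and do not appear in the benchmark code. It only shows the semantics; it makes no claim about timing on any particular machine.

```haskell
-- Illustrative only: remVsMod and divisibleBy are hypothetical names.

-- rem takes the sign of the dividend, mod takes the sign of the divisor,
-- so the two differ only when the operands have different signs.
remVsMod :: [(Integer, Integer, Integer, Integer)]
remVsMod = [ (a, b, a `rem` b, a `mod` b) | a <- [7, -7], b <- [2, -2] ]

-- A divisibility test only asks "is the result zero?", and both operators
-- agree on that for any non-zero divisor.
divisibleBy :: Integer -> Integer -> Bool
divisibleBy d k = k `rem` d == 0
```

Since the sieves above only ever compare the result against 0, and all operands are positive, rem and mod are interchangeable in them.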

While I see you are doing this for your own education, it's worth noting the related links: the Primes page on Haskell.org and the fast Primes package on Hackage.

Answer 2

It looks to me like the problem with your third revision is how you choose the next element to sift on. You indiscriminately increment by 2, so you end up sifting on unnecessary numbers. For example, in this version you're eventually going to pass 9 as m, and you're going to do an extra recursion to filter on 9, even though it isn't even in the list. You should never have picked it in the first place, since it would have been removed by the very first filter on 3.

Even though the second version doesn't start the filtering past the square of the number it sifts on, it never chooses an unnecessary sifting value.

In other words, I think you end up sifting on every odd number from 3 up to about the square root of n (where your guard on m*m stops the recursion). Instead, you should sift only on the odd numbers that haven't already been removed by a previous pass.
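
To make the difference concrete, here is a small sketch that lists the values the third version steps through as m, next to the ones that can still remove anything. The helper names siftValuesV3 and usefulSiftValues are made up for this illustration, and the bound assumes the recursion stops once m*m exceeds n.

```haskell
-- Illustrative helper (hypothetical name): the odd values the third
-- version steps through as m, i.e. every odd m whose square is at most n.
siftValuesV3 :: Integer -> [Integer]
siftValuesV3 n = takeWhile (\m -> m * m <= n) [3, 5 ..]

-- Illustrative helper (hypothetical name): only the primes among them can
-- remove anything, because the multiples of a composite m were already
-- removed when its prime factors were sifted.
usefulSiftValues :: Integer -> [Integer]
usefulSiftValues n = filter isPrime (siftValuesV3 n)
  where
    -- Plain trial division; fine for the small values involved here.
    isPrime k = all (\d -> k `rem` d /= 0) (takeWhile (\d -> d * d <= k) [2 ..])
```

For n = 1000 the first list has 15 entries and the second has 10, so five passes (for 9, 15, 21, 25 and 27) traverse the remaining list without removing anything.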

I think that to correctly implement the optimization of starting the sieve at the square of the current sift value, you have to retain the front of the list while sifting on the back, where the back contains the elements >= the square of the sift value. I think this would force you to use concatenation, and I'm not sure the optimization is good enough to cancel out the overhead of using ++.
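
One way this could look is sketched below. It is not benchmarked and not taken from the question; primesSq is a hypothetical name. It takes the next sift value from the head of the surviving list, so composites such as 9 are never chosen, and it uses span and ++ to keep the front of the list untouched.

```haskell
-- A sketch only: primesSq is a hypothetical name, and this has not been
-- measured against the three versions in the question.
primesSq :: Integer -> [Integer]
primesSq n
  | n < 2     = []
  | otherwise = 2 : sieve [3, 5 .. n]
  where
    sieve :: [Integer] -> [Integer]
    sieve [] = []
    sieve (p:xs)
      -- Once p*p exceeds n, everything left in the list is prime.
      | p * p > n = p : xs
      -- Elements below p*p are already known to be prime, so keep them as
      -- they are and filter only the part of the list from p*p onwards.
      | otherwise = p : sieve (front ++ filter (\y -> y `rem` p /= 0) back)
      where
        (front, back) = span (< p * p) xs
```

Whether the saved divisions outweigh the cost of span and ++ is exactly the open question raised above, so a Criterion run would be needed before calling this an improvement.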

Answer 3

This is not an optimized implementation, but it is an expressive one (see the video "Sieve of Eratosthenes in Haskell"):

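import Data.Set (difference, fromList)

-- kr n l: the proper multiples of n (2n, 3n, ...) that do not exceed l
kr n l = (*n) <$> [2..l `div` n]
-- g n: the set 2..n minus every proper multiple, i.e. the primes up to n
g n = difference (fromList [2..n]) (fromList $ concat $ ((flip kr) n) <$> [2..n])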

3 answers

Answer 1: score 6, posted 2010-10-04 04:39:43
Answer 2: score 6, accepted, posted 2010-10-04 05:29:29
Answer 3: score 1, posted 2020-10-18 19:02:23
