GHC optimization: Collatz conjecture

Question

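I've written code for Project Euler's Challenge 14 in both Haskell and C++ (ideone links). Both of them memoize previously computed results in an array.

Compiled with ghc -O2 and g++ -O3 respectively, the C++ version runs 10-15 times faster than the Haskell version.

I understand that the Haskell version may run slower, and that Haskell is a nicer language to write in. Still, it would be nice to know which code changes I can make to the Haskell version to make it run faster (ideally within a factor of 2 or 3 of the C++ version).

The Haskell code is here: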

import Data.Array
import Data.Word
import Data.List

collatz_array = 
  let
    upperbound = 1000000
    a = array (1, upperbound) [(i :: Word64, f i :: Int) | i <- [1..upperbound]]
    f i = i `seq`
      let
        check_f i = i `seq` if i <= upperbound then a ! i else f i
      in
        if (i == 1) then 0 else (check_f ((if (even i) then i else 3 * i + 1) `div` 2)) + 1
  in a

main = 
  putStrLn $ show $ 
   foldl1' (\(x1,x2) (y1,y2) -> if (x2 >= y2) then (x1, x2) else (y1, y2)) $! (assocs collatz_array)

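Edit:

I have now also written a version using unboxed mutable arrays. It is still about 5 times slower than the C++ version, but it is a significant improvement. The code is on ideone here.

I'd like to know which improvements to the mutable array version would bring it closer to the C++ version.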

Answer 1

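Some problems with your (mutable array) code:

You use a fold to find the maximal chain length. For that, the array has to be converted to an association list, which costs time and allocation that the C++ version doesn't need.
You use even and div for testing parity and for dividing by 2. These are slow. g++ optimizes both operations to the faster bit operations (at least on platforms where those are faster), but GHC doesn't do these low-level optimizations (yet), so for the time being they have to be done by hand.
You use readArray and writeArray. The extra bounds checking, which the C++ code doesn't do, also takes time. Once the other problems are dealt with, it amounts to a significant portion of the running time (about 25% on my box), since the algorithm does a lot of reads and writes.

Incorporating that into the implementation, I get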
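To make the second point concrete, here is a minimal sketch of the hand-written bit operations next to the even/div versions they replace. The names isEvenBits and halveBits are made up for this illustration, and the check only covers the non-negative Int values that occur in this problem.

```haskell
import Data.Bits (shiftR, (.&.))

-- Parity test on the lowest bit (hypothetical helper, stands in for `even`).
isEvenBits :: Int -> Bool
isEvenBits i = i .&. 1 == 0

-- Halving by a right shift (hypothetical helper, stands in for `div` 2).
halveBits :: Int -> Int
halveBits i = i `shiftR` 1

main :: IO ()
main = do
    let xs = [0 .. 100000] :: [Int]
    -- Both checks compare the bit versions against the Prelude versions.
    print (all (\i -> isEvenBits i == even i) xs)
    print (all (\i -> halveBits i == i `div` 2) xs)
```

Both lines print True, so swapping the operations by hand does not change the computed chain lengths.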

import Data.Array.ST
import Data.Array.Base
import Control.Monad.ST
import Data.Bits

collatz_array :: ST s (STUArray s Int Int)
collatz_array = do
    let upper = 10000000
    arr <- newArray (0,upper) 0
    unsafeWrite arr 2 1
    let check i
            | upper < i = return arr
            | i .&. 1 == 0 = do
                l <- unsafeRead arr (i `shiftR` 1)
                unsafeWrite arr i (l+1)
                check (i+1)
            | otherwise = do
                let j = (3*i+1) `shiftR` 1
                    find k l
                        | upper < k = find (next k) $! l+1
                        | k < i     = do
                            m <- unsafeRead arr k
                            return (m+l)
                        | otherwise = do
                            m <- unsafeRead arr k
                            if m == 0
                              then do
                                  n <- find (next k) 1
                                  unsafeWrite arr k n
                                  return (n+l)
                              else return (m+l)
                          where
                            next h
                                | h .&. 1 == 0 = h `shiftR` 1
                                | otherwise = (3*h+1) `shiftR` 1
                l <- find j 1
                unsafeWrite arr i l
                check (i+1)
    check 3

collatz_max :: ST s (Int,Int)
collatz_max = do
    car <- collatz_array
    (_,upper) <- getBounds car
    let find w m i
            | upper < i = return (w,m)
            | otherwise = do
                l <- unsafeRead car i
                if m < l
                  then find i l (i+1)
                  else find w m (i+1)
    find 1 0 2

main :: IO ()
main = print (runST collatz_max)

and the timings (both for 10 million):

$ time ./cccoll
8400511 429

real    0m0.210s
user    0m0.200s
sys     0m0.009s
$ time ./stcoll
(8400511,429)

real    0m0.341s
user    0m0.307s
sys     0m0.033s

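That doesn't look too bad.

Important note: this code only works with a 64-bit GHC (so on Windows in particular you need ghc-7.6.1 or later; earlier GHCs were 32-bit even on 64-bit Windows), because intermediate chain elements exceed the 32-bit range. On 32-bit systems, one would have to use Integer or a 64-bit integer type (Int64 or Word64) for following the chains, at a drastic performance cost, since the primitive 64-bit operations (arithmetic and shifts) are implemented as foreign calls to C functions in 32-bit GHCs (fast foreign calls, but still much slower than direct machine operations).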
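The overflow concern can be checked with a small sketch like the one below. It is not part of the timed program. The names peak and limit32 are made up for this illustration, and the chain is tracked in Int64 so that the check itself also works on a 32-bit GHC.

```haskell
import Data.Int (Int32, Int64)

-- Largest value reached on the Collatz chain starting at n0 (hypothetical helper).
peak :: Int64 -> Int64
peak n0 = go n0 n0
  where
    go m 1 = m
    go m n =
        let n' = if even n then n `div` 2 else 3 * n + 1
            m' = max m n'
        in m' `seq` go m' n'

-- Largest value a signed 32-bit integer can hold.
limit32 :: Int64
limit32 = fromIntegral (maxBound :: Int32)

main :: IO ()
main =
    -- First starting value below one million whose chain leaves the 32-bit range.
    print (take 1 [(n, p) | n <- [1 .. 1000000], let p = peak n, p > limit32])
```

If the printed list is non-empty, a 32-bit Int would wrap around on that chain and produce wrong chain lengths.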

Answer 2

The ideone site is using ghc 6.8.2, which is quite old by now. With ghc version 7.4.1, the difference is much smaller.

With ghc:

$ ghc -O2 euler14.hs && time ./euler14
(837799,329)
./euler14  0.63s user 0.04s system 98% cpu 0.685 total

With g++ 4.7.0:

$ g++ --std=c++0x -O3 euler14.cpp && time ./a.out
8400511 429
./a.out  0.24s user 0.01s system 99% cpu 0.252 total

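For me, the ghc version is only 2.7 times slower than the C++ version. Also, the two programs aren't giving the same result... (not a good sign, especially for benchmarking)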
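One way to check whether two implementations agree before timing them is a slow reference version run on a small bound. The sketch below assumes the step convention of the Haskell code in the question, where an odd step goes straight to (3n+1)/2 and counts as one step. The names chainLen and longest are made up for this illustration, and a 64-bit Int is assumed for larger bounds.

```haskell
import Data.List (foldl')

-- Chain length of n, counting an odd step to (3n+1)/2 as a single step.
chainLen :: Int -> Int
chainLen = go 0
  where
    go acc 1 = acc
    go acc n
        | even n    = acc `seq` go (acc + 1) (n `div` 2)
        | otherwise = acc `seq` go (acc + 1) ((3 * n + 1) `div` 2)

-- Starting value up to `upper` with the longest chain; ties keep the earlier one.
longest :: Int -> (Int, Int)
longest upper = foldl' pick (1, 0) [(n, chainLen n) | n <- [2 .. upper]]
  where
    pick best@(_, bl) cand@(_, cl) = if bl >= cl then best else cand

main :: IO ()
main = print (longest 9)  -- prints (9,13)
```

Running both programs with the same upper bound and comparing against such a reference shows whether a mismatch comes from the input size or from a bug.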
