How can I speed up my Haskell program (to the level of Python)
I have the following toy program, which cyclically shifts a vector and adds it to itself (under a modulus). It does that for different shifts and a high number of iterations (compared to the size of the vector). The program works, but it's dog slow. I am still learning Haskell, so my question is: am I doing something wrong?
import Data.List (foldl')
import qualified Data.Sequence as Seq
import Data.Sequence (index, zipWith, Seq, (><), (<|), (|>))
seqSize = 100
numShifts = 10000
cycleShift :: Integer -> Seq a -> Seq a
cycleShift s l = Seq.drop (fromInteger s) l >< Seq.take (fromInteger s) l
modAdd :: Seq Integer -> Seq Integer -> Seq Integer
modAdd s t = Seq.zipWith (\ a b -> (a + b) `mod` 10^16) s t
step :: Seq Integer -> Integer -> Seq Integer
step l shift = modAdd l (cycleShift shift l)
allshifts = [i `mod` seqSize |i <- [1..numShifts]]
start = Seq.fromList (1 : [0 | i <- [1..(seqSize - 1)]])
end = foldl' step start allshifts
main :: IO ()
main = print (Seq.index end 0)
The same program in Python:
seq_size = 100
num_shifts = 10000
S = [i % seq_size for i in xrange(1, num_shifts + 1)]
ssums = [1] + [0 for i in range(seq_size - 1)]
for s in S:
    shift = ssums[s:] + ssums[:s]
    ssums = [(ssums[i] + shift[i]) % 10**16 for i in range(seq_size)]
print ssums[0]
Here are the timings:
Haskell: real 0m5.596s
Python: real 0m0.551s
Python is not known for its speed, and yet it is 10 times faster?
How are you running it? I get 1.6 seconds for the Haskell version. (Compiled with ghc.exe -O2 seq.hs.)
Also, is there a reason you're using Seq? If I change it to use lists, I get 0.3 seconds execution time.
Here it is with lists:
import Data.List (foldl')
seqSize = 100
numShifts = 10000
cycleShift s l = drop (fromInteger s) l ++ take (fromInteger s) l
modAdd s t = zipWith (\ a b -> (a + b) `mod` 10^16) s t
step l shift = modAdd l (cycleShift shift l)
allshifts = [i `mod` seqSize |i <- [1..numShifts]]
start = (1 : [0 | i <- [1..(seqSize - 1)]])
end = foldl' step start allshifts
main :: IO ()
main = print (end !! 0)
Data.Vector is even faster.
Use rem instead of mod.
Split the list only once in cycleShift. (Before, you split the list twice.)
Use Int instead of Integer if your calculation cannot exceed the bounds. The former is a hardware integer, while the latter is arbitrary precision, emulated in software.
Result: 3.6 secs to 0.5 secs. More is probably possible.
Code:
import Data.List (foldl')
import Data.Tuple
seqSize, numShifts :: Int
seqSize = 100
numShifts = 10000
cycleShift :: Int -> [a] -> [a]
cycleShift s = uncurry (++) . swap . splitAt s
modAdd :: [Int] -> [Int] -> [Int]
modAdd = zipWith (\ a b -> (a + b) `rem` 10^16)
step :: [Int] -> Int -> [Int]
step l shift = modAdd l (cycleShift shift l)
allshifts = map (`rem` seqSize) [1..numShifts]
start = 1 : replicate (seqSize - 1) 0
end = foldl' step start allshifts
main :: IO ()
main = print (head end)
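As a side check on the rem and Int advice, here is a minimal sketch (not part of the original answer). The names remVsMod, worstCaseSum and fitsInInt are made up for illustration. It shows that rem and mod agree for the non-negative values this program produces, and tests whether the largest intermediate sum fits in this platform's Int.

```haskell
-- Illustrative helper (hypothetical name): both remainders for the same operands.
remVsMod :: Int -> Int -> (Int, Int)
remVsMod a b = (a `rem` b, a `mod` b)

-- Largest value the program can ever compute before reducing:
-- two residues, each below 10^16, added together.
worstCaseSum :: Integer
worstCaseSum = 2 * (10 ^ (16 :: Int) - 1)

-- True when that worst case fits in this platform's Int.
-- The Haskell report only guarantees about 30 bits for Int.
fitsInInt :: Bool
fitsInInt = worstCaseSum <= toInteger (maxBound :: Int)

main :: IO ()
main = do
  print (remVsMod 7 3)     -- operands non-negative: both components agree
  print (remVsMod (-7) 3)  -- signs differ: rem truncates toward zero, mod follows the divisor's sign
  print fitsInInt
```

Since every value in this program stays non-negative, swapping mod for rem does not change the result. The Int swap is only safe when fitsInInt is True.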
It gets even faster by using Data.Vector. I get around 0.4 sec on my machine using this code:
import Data.List (foldl')
import Data.Tuple
import Data.Vector (Vector)
import qualified Data.Vector as V
seqSize, numShifts :: Int
seqSize = 100
numShifts = 10000
cycleShift :: Int -> Vector a -> Vector a
cycleShift s = uncurry (V.++) . swap . V.splitAt s
modAdd :: Vector Int -> Vector Int -> Vector Int
modAdd = V.zipWith (\ a b -> (a + b) `rem` 10^16)
step :: Vector Int -> Int -> Vector Int
step l shift = modAdd l (cycleShift shift l)
allshifts = map (`rem` seqSize) [1..numShifts]
start = 1 `V.cons` V.replicate (seqSize - 1) 0
end = foldl' step start allshifts
main :: IO ()
main = print (V.head end)
Using Data.Vector.Unboxed (just change the imports and fix up the signatures), the runtime drops down to 0.074 secs. But the results are only correct if an Int has 64 bits. It may also be that fast using Int64, though.
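For reference, the "change the imports and fix up the signatures" step might look roughly like the sketch below. It is an assumption about what the unboxed variant looks like, not the answerer's actual code. It also uses Int64 for the elements, which the answer only suggests as a possibility, so the result does not depend on the platform's Int width. It needs the vector package, and no timing is claimed for it.

```haskell
import Data.Int (Int64)
import Data.List (foldl')
import qualified Data.Vector.Unboxed as U

seqSize, numShifts :: Int
seqSize = 100
numShifts = 10000

-- Rotate left by s: split once, then append the two parts in swapped order.
cycleShift :: U.Unbox a => Int -> U.Vector a -> U.Vector a
cycleShift s v = let (front, back) = U.splitAt s v in back U.++ front

-- Element-wise addition modulo 10^16; Int64 keeps the sum in range on any platform.
modAdd :: U.Vector Int64 -> U.Vector Int64 -> U.Vector Int64
modAdd = U.zipWith (\a b -> (a + b) `rem` 10 ^ (16 :: Int))

step :: U.Vector Int64 -> Int -> U.Vector Int64
step v shift = modAdd v (cycleShift shift v)

allShifts :: [Int]
allShifts = map (`rem` seqSize) [1 .. numShifts]

start :: U.Vector Int64
start = 1 `U.cons` U.replicate (seqSize - 1) 0

end :: U.Vector Int64
end = foldl' step start allShifts

main :: IO ()
main = print (U.head end)
```

The only changes from the boxed version are the qualified import, the Unbox constraint on cycleShift, and the element type.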
Ensure the Haskell code is compiled and the resulting executable is being timed, not the interpreted version of the code.