Why is the F# version of this program 6x faster than the Haskell one?
Haskell version (1.03s):
module Main where

import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Control.Monad
import Control.Applicative ((<$>))
import Data.Vector.Unboxed (Vector,(!))
import qualified Data.Vector.Unboxed as V

solve :: Vector Int -> Int
solve ar =
    V.foldl' go 0 ar' where
        ar' = V.zip ar (V.postscanr' max 0 ar)
        go sr (p,m) = sr + m - p

main = do
    t <- fmap (read . T.unpack) TIO.getLine -- With Data.Text, the example finishes 15% faster.
    T.unlines . map (T.pack . show . solve . V.fromList . map (read . T.unpack) . T.words)
        <$> replicateM t (TIO.getLine >> TIO.getLine) >>= TIO.putStr
F# version (0.17s):
open System

let solve (ar : uint64[]) =
    let ar' =
        let t = Array.scanBack max ar 0UL |> fun x -> Array.take (x.Length-1) x
        Array.zip ar t
    let go sr (p,m) = sr + m - p
    Array.fold go 0UL ar'

let getIntLine() =
    Console.In.ReadLine().Split [|' '|]
    |> Array.choose (fun x -> if x <> "" then uint64 x |> Some else None)

let getInt() = getIntLine().[0]

let t = getInt()
for i=1 to int t do
    getInt() |> ignore
    let ar = getIntLine()
    printfn "%i" (solve ar)
The two programs above are solutions to the Stock Maximize problem, and the times are for the first test case of the Run Code button.
For some reason the F# version is roughly 6x faster, but I am pretty sure that if I replaced the slow library functions with imperative loops, I could speed it up by at least 3x and more likely 10x.
Could the Haskell version be similarly improved?
I am doing this for learning purposes, and in general I am finding it difficult to figure out how to write efficient Haskell code.
If you switch to ByteString and stick with plain Haskell lists (instead of vectors), you will get a more efficient solution. You may also rewrite the solve function as a single left fold and bypass the zip and the right scan (1). Overall, on my machine, I get a 20x performance improvement compared to your Haskell solution (2).
The Haskell code below runs faster than the F# code:
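import Data.List (unfoldr)
import Control.Applicative ((<$>))
import Control.Monad (replicateM_)
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C

-- Read space-separated Ints until readInt fails (end of line).
parse :: ByteString -> [Int]
parse = unfoldr $ C.readInt . C.dropWhile (== ' ')

-- The left fold builds a function of the running maximum s; applying it
-- to minBound walks the prices from last to first.
solve :: [Int] -> Int
solve xs = foldl go (const 0) xs minBound
    where go f x s = if s < x then f x else s - x + f s

main = do
    [n] <- parse <$> B.getLine
    -- Each test case is two lines: the count (ignored) and the prices.
    replicateM_ n $ B.getLine >> B.getLine >>= print . solve . parse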
1. See the edits for an earlier version of this answer, which implements solve using zip and scanr.
2. The HackerRank website shows an even larger performance improvement.
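To make the function-building fold easier to follow, here is a minimal sketch that sets it next to a zip/scan style formulation and runs both on three small inputs. `solveScan` is my own reconstruction for comparison and is not necessarily the earlier version mentioned in footnote 1; `solveFold` is the `solve` from the answer under a different name.

```haskell
module Main where

-- Reference version (an assumption, written only for comparison):
-- pair each price with the maximum price from that day onwards,
-- then sum the differences.
solveScan :: [Int] -> Int
solveScan xs = sum (zipWith (-) (scanr1 max xs) xs)

-- The answer's version: foldl builds a function of the running maximum s,
-- and applying it to minBound processes the prices from last to first.
solveFold :: [Int] -> Int
solveFold xs = foldl go (const 0) xs minBound
  where go f x s = if s < x then f x else s - x + f s

main :: IO ()
main = mapM_ check [[5,3,2], [1,2,100], [1,3,1,2]]
  where check xs = print (solveScan xs, solveFold xs)
```

Both functions should agree on every input; the fold version simply avoids materialising the intermediate list of running maxima.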
If I wanted to do that quickly in F#, I would avoid all of the higher-order functions inside solve and just write a C-style imperative loop:
let solve (ar : uint64[]) =
    let mutable sr, m = 0UL, 0UL
    for i in ar.Length-1 .. -1 .. 0 do
        let p = ar.[i]
        m <- max p m
        sr <- sr + m - p
    sr
According to my measurements, this is 11x faster than your F#.
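For comparison with the Haskell side of the question, the same right-to-left loop can be sketched as a strict left fold over the reversed list. This is an illustration only and I have not benchmarked it; `solveLoop` is a name chosen here, and like the F# loop it starts the running maximum at 0, so it assumes non-negative prices.

```haskell
module Main where

import Data.List (foldl')

-- Walk the prices from last to first, carrying (profit so far, running max),
-- mirroring the mutable variables sr and m of the imperative loop.
solveLoop :: [Int] -> Int
solveLoop = fst . foldl' step (0, 0) . reverse
  where
    step (sr, m) p =
      let m'  = max p m
          sr' = sr + m' - p
      -- Force both components so no chain of thunks builds up in the pair.
      in sr' `seq` m' `seq` (sr', m')

main :: IO ()
main = mapM_ (print . solveLoop) [[5,3,2], [1,2,100], [1,3,1,2]]
```

The explicit `seq` calls matter because `foldl'` only evaluates the accumulator to weak head normal form, which for a tuple leaves its fields unevaluated.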
With the loop in place, performance is limited by the IO layer (Unicode parsing) and string splitting. This can be optimised by reading into a byte buffer and writing the lexer by hand:
let buf = Array.create 65536 0uy
let mutable idx = 0
let mutable length = 0

do
    use stream = System.Console.OpenStandardInput()
    let rec read m =
        let c =
            if idx < length then
                idx <- idx + 1
            else
                length <- stream.Read(buf, 0, buf.Length)
                idx <- 1
            buf.[idx-1]
        if length > 0 && '0'B <= c && c <= '9'B then
            read (10UL * m + uint64(c - '0'B))
        else
            m
    let read() = read 0UL
    for _ in 1UL .. read() do
        Array.init (read() |> int) (fun _ -> read())
        |> solve
        |> System.Console.WriteLine
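The same idea of lexing raw bytes by hand can be sketched in Haskell over a strict ByteString. This is not a translation of the streaming buffer above: it assumes the whole input is already in memory (for example via `Data.ByteString.getContents`) and contains only unsigned decimal numbers and separators. `lexNumbers` and `isDigitByte` are names made up for this sketch.

```haskell
module Main where

import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C
import Data.Word (Word8)

-- True for the ASCII bytes '0' (48) through '9' (57).
isDigitByte :: Word8 -> Bool
isDigitByte w = w >= 48 && w <= 57

-- Split raw bytes into unsigned decimal numbers, treating every
-- non-digit byte (spaces, newlines, carriage returns) as a separator.
lexNumbers :: B.ByteString -> [Int]
lexNumbers bs
  | B.null rest = []
  | otherwise   = B.foldl' step 0 digits : lexNumbers rest'
  where
    rest            = B.dropWhile (not . isDigitByte) bs
    (digits, rest') = B.span isDigitByte rest
    step acc w      = 10 * acc + fromIntegral (w - 48)

main :: IO ()
main = print (lexNumbers (C.pack "3\n1 2 100\n"))
```

Working on bytes skips text decoding entirely, which is the same saving the F# version gets from reading the standard input stream directly.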
Just for the record, the F# version is also not optimal. I don't think it really matters at this point, but if people want to compare performance, it is worth noting that it can be made faster.
I have not tried very hard (you can certainly make it even faster by using restricted mutation, which would not be against the nature of F#), but a simple change to use Seq instead of Array in the right places (to avoid allocating temporary arrays) makes the code about 2x to 3x faster:
let solve (ar : uint64[]) =
    let ar' = Seq.zip ar (Array.scanBack max ar 0UL)
    let go sr (p,m) = sr + m - p
    Seq.fold go 0UL ar'
If you use Seq.zip, you can also drop the take call (because Seq.zip truncates to the shorter sequence automatically). I measured this using #time with the following snippet:
let rnd = Random()
let inp = Array.init 100000 (fun _ -> uint64 (rnd.Next()))
for a in 0 .. 10 do ignore (solve inp) // Measure this line
I get around 150ms for the original code and between 50ms and 75ms with the new version.