
Why is the F# version of this program 6x faster than the Haskell one?

Haskell version (1.03s):

module Main where
  import qualified Data.Text as T
  import qualified Data.Text.IO as TIO
  import Control.Monad
  import Control.Applicative ((<$>))
  import Data.Vector.Unboxed (Vector,(!))
  import qualified Data.Vector.Unboxed as V

  solve :: Vector Int -> Int
  solve ar =
    V.foldl' go 0 ar' where
      ar' = V.zip ar (V.postscanr' max 0 ar)
      go sr (p,m) = sr + m - p

  main = do
    t <- fmap (read . T.unpack) TIO.getLine -- With Data.Text, the example finishes 15% faster.
    T.unlines . map (T.pack . show . solve . V.fromList . map (read . T.unpack) . T.words)
      <$> replicateM t (TIO.getLine >> TIO.getLine) >>= TIO.putStr

F# version (0.17s):

open System

let solve (ar : uint64[]) =
    let ar' = 
        let t = Array.scanBack max ar 0UL |> fun x -> Array.take (x.Length-1) x
        Array.zip ar t

    let go sr (p,m) = sr + m - p
    Array.fold go 0UL ar'

let getIntLine() =
    Console.In.ReadLine().Split [|' '|]
    |> Array.choose (fun x -> if x <> "" then uint64 x |> Some else None)    

let getInt() = getIntLine().[0]

let t = getInt()
for i=1 to int t do
    getInt() |> ignore
    let ar = getIntLine()
    printfn "%i" (solve ar)

The two programs above are solutions to the Stock Maximize problem, and the times are for the first test case of the Run Code button.

For some reason the F# version is roughly 6x faster, but I am pretty sure that if I replaced the slow library functions with imperative loops, I could speed it up by at least 3x and more likely 10x.

Could the Haskell version be similarly improved?

I am doing this for learning purposes, and in general I am finding it difficult to figure out how to write efficient Haskell code.

If you switch to ByteString and stick with plain Haskell lists (instead of vectors), you will get a more efficient solution. You can also rewrite the solve function as a single left fold, bypassing the zip and the right scan (1). Overall, on my machine, I get a 20x performance improvement over your Haskell solution (2).

The Haskell code below runs faster than the F# code:

import Data.List (unfoldr)
import Control.Applicative ((<$>))
import Control.Monad (replicateM_)
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C

parse :: ByteString -> [Int]
parse = unfoldr $ C.readInt . C.dropWhile (== ' ')

solve :: [Int] -> Int
solve xs = foldl go (const 0) xs minBound
    where go f x s = if s < x then f x else s - x + f s

main = do
    [n] <- parse <$> B.getLine
    replicateM_ n $ B.getLine >> B.getLine >>= print . solve . parse

1. See the edits for an earlier version of this answer, which implements solve using zip and scanr.
2. The HackerRank website shows an even larger performance improvement.
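
The fold above builds a chain of functions and then applies it to minBound, which can be hard to read at first. As a rough sketch of what it appears to compute, the version below walks the prices from the last day to the first while carrying the running profit and the highest later price. The names solveFold and solveRev are made up for this illustration (solveFold is just the solve from this answer, copied so the two can be compared), and the sketch is meant to show the idea rather than to be the fastest formulation.

```haskell
import Data.List (foldl')

-- The fold from the answer above, copied here only for comparison.
solveFold :: [Int] -> Int
solveFold xs = foldl go (const 0) xs minBound
  where go f x s = if s < x then f x else s - x + f s

-- Hypothetical explicit version: visit the prices from the last day to the
-- first, carrying (profit so far, highest price seen among later days).
solveRev :: [Int] -> Int
solveRev = fst . foldl' step (0, minBound) . reverse
  where
    step (profit, best) p =
      let best'   = max best p            -- best selling price from here on
          profit' = profit + best' - p    -- buy at p, sell at best'
      in  profit' `seq` best' `seq` (profit', best')
```

In GHCi, solveFold [1,2,100] and solveRev [1,2,100] both evaluate to 197: buy on the first two days and sell both shares on the last.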

If I wanted to do that quickly in F#, I would avoid all of the higher-order functions inside solve and just write a C-style imperative loop:

let solve (ar : uint64[]) =
  let mutable sr, m = 0UL, 0UL
  for i in ar.Length-1 .. -1 .. 0 do
    let p = ar.[i]
    m <- max p m
    sr <- sr + m - p
  sr

According to my measurements, this is 11x faster than your F#.

The performance is then limited by the IO layer (Unicode parsing) and string splitting. This can be optimised by reading into a byte buffer and writing the lexer by hand:

let buf = Array.create 65536 0uy
let mutable idx = 0
let mutable length = 0

do
  use stream = System.Console.OpenStandardInput()
  let rec read m =
    let c =
      if idx < length then
        idx <- idx + 1
      else
        length <- stream.Read(buf, 0, buf.Length)
        idx <- 1
      buf.[idx-1]
    if length > 0 && '0'B <= c && c <= '9'B then
      read (10UL * m + uint64(c - '0'B))
    else
      m
  let read() = read 0UL
  for _ in 1UL .. read() do
    Array.init (read() |> int) (fun _ -> read())
    |> solve
    |> System.Console.WriteLine
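
For readers following the Haskell side of the question, the same idea (work on raw bytes and accumulate digits by hand instead of splitting strings) might look roughly like the sketch below. It uses Data.ByteString.Char8 from the bytestring package. The names readNat and readAllNats are hypothetical, and unlike the F# lexer above it skips any run of non-digit bytes before each number, so treat it as an illustration under those assumptions rather than a port.

```haskell
import qualified Data.ByteString.Char8 as C
import Data.Char (isDigit, ord)
import Data.List (unfoldr)

-- Hypothetical hand-written lexer: read one unsigned decimal number from the
-- front of the buffer, skipping any non-digit bytes that precede it.
-- Returns Nothing when no digits remain.
readNat :: C.ByteString -> Maybe (Int, C.ByteString)
readNat bs
  | C.null digits = Nothing
  | otherwise     = Just (C.foldl' step 0 digits, rest)
  where
    (digits, rest) = C.span isDigit (C.dropWhile (not . isDigit) bs)
    step acc c     = 10 * acc + (ord c - ord '0')   -- accumulate one digit

-- Tokenise a whole buffer into the numbers it contains.
readAllNats :: C.ByteString -> [Int]
readAllNats = unfoldr readNat
```

With this sketch, readAllNats (C.pack "3\n1 2 100\n") yields [3,1,2,100], so the whole input could be read once and then consumed as a flat list of numbers.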

Just for the record, the F# version is also not optimal. I don't think it really matters at this point, but if people want to compare performance, it is worth noting that it can be made faster.

I have not tried very hard (you can certainly make it even faster by using restricted mutation, which would not be against the nature of F#), but a simple change to use Seq instead of Array in the right places (to avoid allocating temporary arrays) makes the code about 2x to 3x faster:

let solve (ar : uint64[]) =
    let ar' = Seq.zip ar (Array.scanBack max ar 0UL)    
    let go sr (p,m) = sr + m - p
    Seq.fold go 0UL ar'

If you use Seq.zip, you can also drop the take call (because Seq.zip truncates the sequence automatically). I measured this with #time using the following snippet:

let rnd = System.Random()
let inp = Array.init 100000 (fun _ -> uint64 (rnd.Next()))
for a in 0 .. 10 do ignore (solve inp) // Measure this line

I get around 150ms for the original code and somewhere between 50 and 75ms with the new version.
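
The truncation point carries over to Haskell lists as well. In the sketch below, scanr max 0 produces one more element than its input (the trailing seed), and zip simply stops at the shorter list, so no explicit take is needed. The name solveZip is made up for this illustration, and no timing claim is attached to it.

```haskell
-- Hypothetical zip-and-scan formulation on plain lists.
-- scanr max 0 xs has length (length xs + 1); zip drops the extra seed element.
solveZip :: [Int] -> Int
solveZip xs = sum [m - p | (p, m) <- zip xs (scanr max 0 xs)]
```

For example, scanr max 0 [1,2,100] is [100,100,100,0], and zipping it with [1,2,100] keeps only the first three pairs, giving solveZip [1,2,100] == 197.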
