Haskell module optimization

I have a problem with Haskell module optimization.

Here is the Main module.

{-# LANGUAGE OverloadedStrings #-}
module Main where

import Control.DeepSeq
import Formatting
import Formatting.Clock
import System.Clock

import Data.Array

size :: Int
size = 200 :: Int

stdMult     :: (Ix a, Ix b, Ix c, Num d) =>
               Array (a,b) d -> Array (b,c) d -> Array (a,c) d
stdMult x y =  array resultBounds
                 [((i,j), sum [ x!(i,k) * y!(k,j) | k <- range (lj,uj)])
                                   | i <- range (li,ui),
                                     j <- range (lj',uj') ]
    where ((li,lj),(ui,uj))     = bounds x
          ((li',lj'),(ui',uj')) = bounds y
          resultBounds
            | (lj,uj)==(li',ui') = ((li,lj'),(ui,uj'))
            | otherwise = error "error"


main :: IO ()
main = do
  let a = array ((1,1),(size, size)) [((i,j), 2*i-j) |
                                  i <- range (1,size),
                                  j <- range (1,size)]
  let b = array ((1,1),(size, size)) [((i,j), 2*i+3*j) |
                                  i <- range (1,size),
                                  j <- range (1,size)]

  start <- getTime ProcessCPUTime
  let
    c = stdMult a b
  end <- c `deepseq` getTime ProcessCPUTime
  fprint (timeSpecs % "\n") start end
  return()

When stdMult is in the Main module, everything works fine. Then I moved stdMult to another module. Without GHC optimization, the execution time is the same in both cases. With the GHC option -O3, the execution time decreases when stdMult is in the Main module, but it is almost unchanged when stdMult is in another module! For example, for a 500x500 matrix, it takes 3 seconds when stdMult is in Main and 30 seconds when it is not.

It is very strange!

(You need the clock and formatting packages from Hackage to compile the code.)

I can reproduce the 10x slowdown when stdMult is in a different module. Luckily a fix is easy: in the module where stdMult is defined, add an INLINABLE pragma:

{-# INLINABLE stdMult #-}

It adds the definition to the interface file (.hi), which allows inlining in the modules that use it, which in turn allows it to be specialized to fast machine Int instead of slow abstract Ix and Num polymorphic code. (If it's in the same module, GHC can inline and specialize at will. Things aren't INLINABLE by default because that can cause executable code bloat and slower compilation.)
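For illustration, here is roughly what the separate module could look like with the pragma in place. The module name MatMul (and so the file name MatMul.hs) is my own placeholder, since the question does not say what the other module is called. The body of stdMult is copied unchanged from the question.

```haskell
-- MatMul.hs (hypothetical module name; the question does not name it)
module MatMul (stdMult) where

import Data.Array

-- Expose the unfolding of stdMult in the interface file so that
-- importing modules can inline and specialize it at their own types.
{-# INLINABLE stdMult #-}
stdMult     :: (Ix a, Ix b, Ix c, Num d) =>
               Array (a,b) d -> Array (b,c) d -> Array (a,c) d
stdMult x y =  array resultBounds
                 [((i,j), sum [ x!(i,k) * y!(k,j) | k <- range (lj,uj)])
                                   | i <- range (li,ui),
                                     j <- range (lj',uj') ]
    where ((li,lj),(ui,uj))     = bounds x
          ((li',lj'),(ui',uj')) = bounds y
          resultBounds
            | (lj,uj)==(li',ui') = ((li,lj'),(ui,uj'))
            | otherwise = error "error"
```

Main then only needs `import MatMul (stdMult)` in place of the local definition. The rest of the program stays as it is.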

As an alternative to INLINABLE, you can manually SPECIALIZE to the types you want optimized implementations for. This is a bit more verbose, but should be faster to compile in big projects (it will be specialized once per export, instead of once per import, at a rough guess).

{-# SPECIALIZE stdMult :: Array (Int, Int) Int -> Array (Int, Int) Int -> Array (Int, Int) Int #-}
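As a sketch of where this pragma goes: it sits in the defining module, next to the type signature of stdMult. The module name MatMulSpec is again just a placeholder of mine. The function stays polymorphic for other callers, and GHC additionally compiles a copy at the Int types used by Main.

```haskell
-- MatMulSpec.hs (hypothetical module name, for illustration only)
module MatMulSpec (stdMult) where

import Data.Array

stdMult     :: (Ix a, Ix b, Ix c, Num d) =>
               Array (a,b) d -> Array (b,c) d -> Array (a,c) d
-- Ask GHC to also generate a monomorphic version for Int indices and
-- Int elements, which is the instantiation Main uses.
{-# SPECIALIZE stdMult :: Array (Int, Int) Int -> Array (Int, Int) Int -> Array (Int, Int) Int #-}
stdMult x y =  array resultBounds
                 [((i,j), sum [ x!(i,k) * y!(k,j) | k <- range (lj,uj)])
                                   | i <- range (li,ui),
                                     j <- range (lj',uj') ]
    where ((li,lj),(ui,uj))     = bounds x
          ((li',lj'),(ui',uj')) = bounds y
          resultBounds
            | (lj,uj)==(li',ui') = ((li,lj'),(ui,uj'))
            | otherwise = error "error"
```

Calls at other types (say, Double elements) still compile, but they go through the generic dictionary-passing code unless you add another SPECIALIZE line for them.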
