Haskell performance when using classes and instances
The Problem
I want to simulate multi-value output functions in Haskell. The Haskell code is generated (not handwritten); this is important, see below.
This can of course easily be done by returning a tuple from a function, like
f x y = (x+y, x-y)
But when using such a function I have to know what kind of tuple it returns:
...
(out_f_1, out_f_2) = f a b
(out_g_1, out_g_2, out_g_3) = g out_f_1
...
And so on. But when generating the code, I don't know the output type of, say, f, so right now I'm using the Data.Tuple.Select module (from the tuple package) and simulate the above with:
import Data.Tuple.Select
...
out_f = f a b
out_g = g (sel1 out_f)
...
The problem is performance: in my test program, the version that uses Data.Tuple.Select is about twice as slow as the version written by hand.
This is not surprising, because Data.Tuple.Select is written using classes and instances, so it uses some kind of runtime dictionary (if I'm not wrong). ( http://hackage.haskell.org/packages/archive/tuple/0.2.0.1/doc/html/src/Data-Tuple-Select.html#sel1 )
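To make the "runtime dictionary" idea concrete, here is a minimal sketch of how a class constraint can be read as a hidden record-of-functions argument. Everything in it is hypothetical and chosen for illustration: the class Sel, the type Pair, and the names selFirst, SelDict, pairDict, plusOne and plusOneD are not the tuple package's definitions, and the code GHC actually generates will differ in detail.

```haskell
module Main (main) where

-- Hypothetical selector class (an assumption for illustration,
-- not the definition used by Data.Tuple.Select).
class Sel t where
  selFirst :: t -> Int

data Pair = Pair Int Int

instance Sel Pair where
  selFirst (Pair x _) = x

-- Overloaded function: the `Sel t` constraint is conceptually
-- an extra argument that the compiler supplies at each call site.
plusOne :: Sel t => t -> Int
plusOne t = selFirst t + 1

-- The same idea with the dictionary written out as an ordinary value.
newtype SelDict t = SelDict { selFirstD :: t -> Int }

pairDict :: SelDict Pair
pairDict = SelDict { selFirstD = \(Pair x _) -> x }

plusOneD :: SelDict t -> t -> Int
plusOneD dict t = selFirstD dict t + 1

main :: IO ()
main = do
  print (plusOne (Pair 3 4))            -- class-based call
  print (plusOneD pairDict (Pair 3 4))  -- explicit-dictionary call
```

Both calls compute the same value. When the concrete type is known at the call site and optimizations are on, GHC can often resolve the dictionary at compile time, so the indirection need not survive to run time.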
The Question
Is it possible to somehow compile the version that uses Data.Tuple.Select so that it is as fast as the manually crafted one?
I think there should be a compiler switch that tells it to "instantiate" the classes and instances for each use (something like templates in C++).
Benchmarks
Test1.hs:
import qualified Data.Vector as V
import System.Environment
b :: Int -> Int
b x = x + 5
c x = b x + 1
d x = b x - 1
a x = c x + d x
main = do
    putStrLn "Starting..."
    args <- getArgs
    let iternum = read (head args) :: Int in do
        putStrLn $ show $ V.foldl' (+) 0 $ V.map (\i -> a (iternum-i))
            $ V.enumFromTo 1 iternum
        putStrLn "Done."
Compile with ghc -O3 Test1.hs
Test2.hs:
import qualified Data.Vector as V
import Data.Tuple.Select
import Data.Tuple.OneTuple
import System.Environment
b x = OneTuple $ x + 5
c x = OneTuple $ (sel1 $ b x) + 1
d x = OneTuple $ (sel1 $ b x) - 1
a x = OneTuple $ (sel1 $ c x) + (sel1 $ d x)
main = do
    putStrLn "Starting..."
    args <- getArgs
    let iternum = read (head args) :: Int in do
        putStrLn $ show $ V.foldl' (+) 0 $ V.map (\i -> sel1 $ a (iternum-i))
            $ V.enumFromTo 1 iternum
        putStrLn "Done."
Compile with ghc -O3 Test2.hs
Results
time ./Test1 10000000 = 5.54 s
time ./Test2 10000000 = 10.06 s
OK, the results I posted are not accurate: as @sabauma pointed out, the two programs run in the same time if you compile them with optimizations enabled.
@tohava's answer is very good if you want to explicitly state which functions to specialize (see @sabauma's comment above).
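As an illustration of naming the functions to specialize, here is a minimal sketch using GHC's SPECIALIZE pragma on a locally defined overloaded function. The class HasFirst, the types One and Pair, and the function addFirsts are hypothetical names introduced for this example; they are not taken from the answers referred to above or from the tuple package.

```haskell
module Main (main) where

-- Hypothetical class standing in for a tuple selector
-- (an assumption for illustration only).
class HasFirst t where
  firstOf :: t -> Int

newtype One = One Int
data Pair = Pair Int Int

instance HasFirst One where
  firstOf (One x) = x

instance HasFirst Pair where
  firstOf (Pair x _) = x

-- Overloaded function: without specialization it receives
-- one dictionary argument per constraint.
addFirsts :: (HasFirst s, HasFirst t) => s -> t -> Int
addFirsts s t = firstOf s + firstOf t

-- Ask GHC to also generate a copy fixed to these concrete types,
-- so calls at this type need no dictionary arguments.
{-# SPECIALIZE addFirsts :: One -> Pair -> Int #-}

main :: IO ()
main = print (addFirsts (One 2) (Pair 3 4))
```

The pragma only has an effect when compiling with optimizations. For small functions like these, GHC will often inline or specialize on its own, which fits the observation that both benchmark programs run equally fast once optimizations are enabled.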