简体   繁体   中英

Haskell(GHC) specialization tour & efficient TypeFamilies

Haskell is all about abstraction. But abstraction costs us extra CPU cycles and extra memory usage due to common representation of all abstract (polymorphic) data - pointer on heap. There are some ways to make abstract code play better with high performance demands. As far as I understand, one way it is done is specialization - basically extra code generation(manual or by compiler), correct ?

Let's assume that all code below is Strict (which helps compiler perform more optimizations ?)

If we have a function sum :

sum :: (Num a) => a -> a -> a

We can generate specialized version of it using specialize pragma:

{-#SPECIALIZE sum :: Float -> Float -> Float#-}

Now if haskell compiler can determine at compile time that we call sum on two Float s, it is going to use specialized version of it. No heap allocations, right ?

Functions - done. Same pragma can be applied to class instances. Logic does not change here, does it ?

But what about data types ? I suspect that TypeFamilies are in charge here ?

Let's try to specialize dependent length-indexed list.

--UVec for unboxed vector
class UVec a where
   data Vec (n :: Nat) a :: *

instance UVec Float where
   data Vec n Float where
     VNilFloat :: Vec 0 Float
     VConsFloat :: {-#UNPACK#-}Float ->
                   Vec n Float -> 
                   Vec (N :+ 1) Float

But Vec has a problem. We can't pattern match on its constructors as each instance of UVec does not have to provide Vec with identical constructors. This forces us to implement each function on Vec for each instance of Vec (as lack of pattern matching implies that it can't be polymorphic on Vec ). What is the best practice in such case ?

As you say, we can't pattern match on UVec a without knowing what a is. One option is to use another typeclass that extends your vector class with a custom function.

class UVec a => UVecSum a where
   sum :: UVec a -> a

instance UVecSum Float where
   sum = ... -- use pattern match here

If, later on, we use sum v where v :: UVec Float , the Float -specific code we defined in the instance will be called.

Partial answer, but perhaps it might help.

As far as I understand, one way it is done is specialization - basically extra code generation(manual or by compiler), correct ?

Yes, this is similar to code instantiation in C++ templates.

Now if haskell compiler can determine at compile time that we call sum on two Floats, it is going to use specialized version of it. No heap allocations, right ?

Yes the compiler calls the specialised version whenever possible. Not sure what you mean regarding the heap allocations.

Regarding the dependently types vectors: usually (I know this from Idris) the length of the vector is eliminated by the compiler when possible. It is intended for stronger type checking. At runtime the length information is useless and can be dropped.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM