
Haskell (GHC) specialization tour & efficient TypeFamilies

Haskell is all about abstraction. But abstraction costs extra CPU cycles and extra memory, because all abstract (polymorphic) data shares one common representation: a pointer to a heap object. There are some ways to make abstract code play better with high performance demands. As far as I understand, one of them is specialization, which is basically extra code generation (manual or by the compiler). Is that correct?

Let's assume that all code below is Strict (which helps the compiler perform more optimizations?).

If we have a function sum:

sum :: (Num a) => a -> a -> a

We can generate a specialized version of it using the SPECIALIZE pragma:

{-# SPECIALIZE sum :: Float -> Float -> Float #-}

Now, if the Haskell compiler can determine at compile time that we call sum on two Floats, it is going to use the specialized version. No heap allocations, right?
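As a minimal sketch of the pragma in context: the module below renames the function to addNum (a hypothetical name, chosen only to avoid a clash with Prelude.sum) and adds a hypothetical Float-only caller, addFloats. The pragma asks GHC to compile a Float-only copy that takes no Num dictionary, and calls at type Float are redirected to it by a rewrite rule, which only fires when optimization (-O) is on. Whether the Floats also end up unboxed depends on GHC's strictness analysis, so "no heap allocations" is not guaranteed by the pragma alone.

```haskell
-- Hypothetical name: addNum stands in for the question's `sum`.
addNum :: Num a => a -> a -> a
addNum x y = x + y
-- Ask GHC to also generate a copy fixed at Float (no Num dictionary argument).
{-# SPECIALIZE addNum :: Float -> Float -> Float #-}

-- Hypothetical call site at a known type: with -O, GHC can rewrite this
-- call to use the specialized copy of addNum.
addFloats :: Float -> Float -> Float
addFloats = addNum
```

Loading this file in GHCi and evaluating addFloats 1.5 2.5 exercises the Float call path; compiling with -O -ddump-simpl is one way to check which version is actually called.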

Functions: done. The same pragma can be applied to class instances. The logic does not change here, does it?
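For instances, GHC spells the pragma SPECIALIZE instance and expects it inside the instance body. The sketch below uses a hypothetical type V2 and a hypothetical class Scalable, neither of which comes from the question. FlexibleInstances is switched on as a precaution, on the assumption that some GHC versions may want it for the V2 Float head in the pragma.

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Hypothetical two-component vector, for illustration only.
data V2 a = V2 a a
  deriving (Eq, Show)

-- Hypothetical class with a single method that doubles every component.
class Scalable v where
  double :: v -> v

-- One generic instance for every Num element type ...
instance Num a => Scalable (V2 a) where
  -- ... plus a request for a copy of the whole instance fixed at Float.
  {-# SPECIALIZE instance Scalable (V2 Float) #-}
  double (V2 x y) = V2 (x + x) (y + y)
```

The logic is the same as for functions: the generic instance still exists, and the specialized copy is used where the element type is known to be Float.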

But what about data types? I suspect that TypeFamilies are in charge here?

Let's try to specialize a dependent length-indexed list.

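{-# LANGUAGE TypeFamilies, DataKinds, GADTs, KindSignatures, TypeOperators #-}
import Data.Kind (Type)
import GHC.TypeLits (Nat, type (+))

-- UVec for unboxed vector
class UVec a where
   data Vec (n :: Nat) a :: Type

instance UVec Float where
   data Vec n Float where
     VNilFloat :: Vec 0 Float
     -- UNPACK only applies to strict fields, hence the bang
     VConsFloat :: {-# UNPACK #-} !Float ->
                   Vec n Float ->
                   Vec (n + 1) Float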

But Vec has a problem. We can't pattern match on its constructors polymorphically, because each instance of UVec is free to give Vec different constructors. This forces us to implement each function on Vec separately for each instance of UVec (without pattern matching, the function can't be polymorphic in the element type). What is the best practice in such a case?

As you say, we can't pattern match on a Vec n a without knowing what a is. One option is to use another type class that extends your vector class with a custom function.

class UVec a => UVecSum a where
   sum :: Vec n a -> a

instance UVecSum Float where
   sum = ... -- use pattern match here

If, later on, we use sum v where v :: Vec n Float, the Float-specific code we defined in the instance will be called.
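To make the idea concrete, here is one possible way to fill in the Float instance. This is a sketch under assumptions, not necessarily what the answer had in mind: the extension list, the imports, the use of Type instead of *, and the recursive body of sum are all additions made for illustration.

```haskell
{-# LANGUAGE TypeFamilies, DataKinds, GADTs, KindSignatures, TypeOperators #-}

import Prelude hiding (sum)          -- the class method below reuses the name
import Data.Kind (Type)
import GHC.TypeLits (Nat, type (+))

-- Each element type gets its own representation of Vec.
class UVec a where
  data Vec (n :: Nat) a :: Type

instance UVec Float where
  data Vec n Float where
    VNilFloat  :: Vec 0 Float
    VConsFloat :: {-# UNPACK #-} !Float -> Vec n Float -> Vec (n + 1) Float

-- Functions that need the constructors become methods of a subclass.
class UVec a => UVecSum a where
  sum :: Vec n a -> a

-- Inside the instance the element type is known to be Float,
-- so matching on the Float-specific constructors is allowed.
instance UVecSum Float where
  sum VNilFloat         = 0
  sum (VConsFloat x xs) = x + sum xs
```

With these definitions, sum (VConsFloat 1.5 (VConsFloat 2.5 VNilFloat)) dispatches to the Float instance. The cost of this design is one instance per element type for every such function, which is the duplication the question was worried about.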

Partial answer, but perhaps it might help.

"As far as I understand, one way it is done is specialization - basically extra code generation (manual or by compiler), correct?"

Yes, this is similar to template instantiation in C++.

"Now if haskell compiler can determine at compile time that we call sum on two Floats, it is going to use specialized version of it. No heap allocations, right?"

Yes, the compiler calls the specialised version whenever possible. I'm not sure what you mean regarding the heap allocations.

Regarding dependently typed vectors: usually (I know this from Idris) the length of the vector is eliminated by the compiler when possible. It is intended for stronger type checking. At runtime the length information is useless and can be dropped.
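In GHC the picture is similar, as far as I can tell: types are erased during compilation, so a type-level length index is not stored as a field of the vector. The sketch below uses a hypothetical standalone type FVec (not the Vec from the question) and shows two ways to get the length back when it is needed after all: walking the constructors, or asking for a KnownNat dictionary, which is passed as an extra argument rather than read from the vector.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators #-}

import GHC.TypeLits (KnownNat, Nat, natVal, type (+))

-- Hypothetical length-indexed vector of Floats; n exists only in the type.
data FVec (n :: Nat) where
  FNil  :: FVec 0
  FCons :: !Float -> FVec n -> FVec (n + 1)

-- No length field is stored, so one option is to count the cells.
lengthByWalk :: FVec n -> Integer
lengthByWalk FNil         = 0
lengthByWalk (FCons _ xs) = 1 + lengthByWalk xs

-- The other option: the caller supplies the length as a KnownNat dictionary.
lengthByType :: KnownNat n => FVec n -> Integer
lengthByType v = natVal v
```

Functions that never ask for the length, such as a plain sum over the elements, pay nothing for the index.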
