
How does GHC handle typeclasses and instances in Core?

I compiled the following Haskell code to Core:

class FunClass a where
  functionInClass :: a -> ()

data MyData = MyData
data YourData = YourData

instance FunClass MyData where
  functionInClass a = ()
instance FunClass YourData where
  functionInClass a = ()

valueA :: ()
valueA = functionInClass MyData

valueB :: ()
valueB = functionInClass YourData

and got the following Core bindings (I deleted some irrelevant boilerplate):

 $cfunctionInClass :: MyData -> ()
 [LclId]
 $cfunctionInClass = \ _ [Occ=Dead] -> break<3>() ()

 $fFunClassMyData [InlPrag=INLINE (sat-args=0)] :: FunClass MyData
 $fFunClassMyData
   = $cfunctionInClass
     `cast` (Sym (N:FunClass[0] <MyData>_N)
             :: Coercible (MyData -> ()) (FunClass MyData))

 $cfunctionInClass :: YourData -> ()
 [LclId]
 $cfunctionInClass = \ _ [Occ=Dead] -> break<2>() ()

 $fFunClassYourData [InlPrag=INLINE (sat-args=0)] :: FunClass YourData
 $fFunClassYourData
   = $cfunctionInClass
     `cast` (Sym (N:FunClass[0] <YourData>_N)
             :: Coercible (YourData -> ()) (FunClass YourData))

 valueA :: ()
 [LclIdX]
 valueA
   = break<1>() functionInClass @ MyData $fFunClassMyData MyData

 valueB :: ()
 [LclIdX]
 valueB
   = break<0>()
     functionInClass @ YourData $fFunClassYourData YourData

My questions are:

  1. Why do the two $cfunctionInClass bindings share the same name? How do we tell them apart?

  2. What does cast do exactly?

  3. Is there anything related to typeclasses/instances outside of mg_binds in ModGuts?

Without knowing (i) exactly which version of GHC you used, (ii) the exact ghc command line, and (iii) the full contents of the file you compiled, it's hard to reproduce the Core output you're asking about, but here are some answers:

1) When generating your Core, you probably supplied the flag -dsuppress-uniques, used some other flag that implies it, or perhaps used an older version of GHC where it was the default. This flag makes GHC omit from the Core output the short suffixes it uses to make names unique. If you remove the flag or add an explicit -dno-suppress-uniques, you should see unique names like $cfunctionInClass_r1cH and $cfunctionInClass_r1dh.
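If you want to see the difference yourself, one way is to put the flags in an OPTIONS_GHC pragma at the top of the module. This is only a sketch: it assumes you compile the file directly with ghc and that -ddump-simpl is the dump you are looking at, which may not match your setup (you may be using a different dump or going through the GHC API).

```haskell
{-# OPTIONS_GHC -ddump-simpl -dno-suppress-uniques #-}
-- Assumption: -ddump-simpl is the Core dump of interest here.
-- Load this file in GHCi, or add a `main` to build it as an executable.

class FunClass a where
  functionInClass :: a -> ()

data MyData = MyData
data YourData = YourData

instance FunClass MyData where
  functionInClass _ = ()

instance FunClass YourData where
  functionInClass _ = ()

valueA :: ()
valueA = functionInClass MyData

valueB :: ()
valueB = functionInClass YourData
```

Swapping -dno-suppress-uniques for -dsuppress-uniques should bring back the two identical-looking $cfunctionInClass names. The exact suffixes you get will differ from the ones quoted above.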

2) Core is a typed language, and cast is used (extensively) to change the types of expressions. Note that it does not change the internal representation of the expression itself, so it can only be used to switch between types that have the same representation in memory.

You'll see casts all over the place in code that uses newtypes. For example, the code:

newtype MyInt = MyInt Int
inc (MyInt n) = MyInt (n + 1)

produces the (unoptimized) Core:

inc1 :: MyInt -> Int
inc1
  = \ (ds :: MyInt) ->
      + @ Int $fNumInt (ds `cast` (N:MyInt[0] :: MyInt ~R# Int)) (I# 1#)
inc :: MyInt -> MyInt
inc
  = inc1
    `cast` (<MyInt>_R ->_R Sym (N:MyInt[0])
            :: (MyInt -> Int) ~R# (MyInt -> MyInt))

with several casts.

The way cast works, the left-hand side of the `cast` operator is a normal Core term (e.g., a variable or other expression) representing the value whose type is being changed; the right-hand side is something called a "coercion", a piece of evidence that the compiler constructs to prove that two types are representationally equivalent (i.e., they have the same in-memory representation and so can be safely coerced). For example, in my newtype example above, the coercion for the first cast:

N:MyInt[0] :: MyInt ~R# Int

is a coercion value N:MyInt[0] whose type is the representational equality ( ~R# ) of MyInt and Int. (Technically, N:MyInt[0] is a coercion type whose kind is the given representational equality, but that distinction doesn't really matter.) If you're familiar with the Curry-Howard isomorphism, where values can be considered proofs of their types, this is an example of it in action deep in GHC's guts: the value/type N:MyInt[0] proves its type/kind, namely the representational equality of the newtype and its contents, which allows the cast to take place.
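At the source level, the closest thing to `cast` is coerce from Data.Coerce, which only typechecks when GHC can build the kind of representational-equality evidence described above. As a small illustration (the names incViaCoerce and unwrapAll are made up for this sketch, and the deriving clause is added only so the values can be compared and printed):

```haskell
import Data.Coerce (coerce)

newtype MyInt = MyInt Int
  deriving (Eq, Show)

-- Explicit unwrapping and rewrapping, as in the newtype example above.
inc :: MyInt -> MyInt
inc (MyInt n) = MyInt (n + 1)

-- The same function obtained by coercing a function on Int.
-- This needs Coercible (Int -> Int) (MyInt -> MyInt), which GHC
-- solves from the representational equality of MyInt and Int.
incViaCoerce :: MyInt -> MyInt
incViaCoerce = coerce ((+ 1) :: Int -> Int)

-- Coercions also lift through type constructors such as lists,
-- so no element-by-element unwrapping has to be written.
unwrapAll :: [MyInt] -> [Int]
unwrapAll = coerce
```

In GHCi, inc (MyInt 1) and incViaCoerce (MyInt 1) give the same result; the second version simply asks the compiler to supply the coercion instead of pattern matching on the constructor.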

In your example, the cast:

$fFunClassMyData [InlPrag=INLINE (sat-args=0)] :: FunClass MyData
$fFunClassMyData
  = $cfunctionInClass
    `cast` (Sym (N:FunClass[0] <MyData>_N)
            :: Coercible (MyData -> ()) (FunClass MyData))

is a complex way of saying that GHC represents instance dictionaries for type classes with only one method in much the same way it would represent a newtype wrapping a function of that type, which in turn is the same way it would represent the function value itself. Therefore, the function value $cfunctionInClass can be directly cast to a dictionary value.
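One way to picture this is to write the dictionary passing by hand. The following is only a mental model, not what GHC literally generates: FunClassDict, dictMyData and the other names are hypothetical stand-ins for the class, $fFunClassMyData and $cfunctionInClass.

```haskell
-- Hypothetical stand-in for the dictionary type of a one-method class:
-- a newtype around the single method.
newtype FunClassDict a = FunClassDict (a -> ())

data MyData = MyData
data YourData = YourData

-- Play the role of the two $cfunctionInClass bindings.
functionInClassMyData :: MyData -> ()
functionInClassMyData _ = ()

functionInClassYourData :: YourData -> ()
functionInClassYourData _ = ()

-- Play the role of $fFunClassMyData and $fFunClassYourData.
-- Wrapping in a newtype constructor costs nothing at runtime.
dictMyData :: FunClassDict MyData
dictMyData = FunClassDict functionInClassMyData

dictYourData :: FunClassDict YourData
dictYourData = FunClassDict functionInClassYourData

-- The method selector takes the dictionary as an explicit argument.
functionInClass :: FunClassDict a -> a -> ()
functionInClass (FunClassDict f) = f

-- Compare with: functionInClass @ MyData $fFunClassMyData MyData
valueA :: ()
valueA = functionInClass dictMyData MyData

valueB :: ()
valueB = functionInClass dictYourData YourData
```

The explicit dictMyData argument corresponds to the $fFunClassMyData argument that shows up in the Core for valueA.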

However, if you added another method to your typeclass:

class FunClass a where
  functionInClass :: a -> ()
  anotherFunction :: a

the casts would disappear from the definitions of the dictionaries, and they'd look more like what you'd expect:

$fFunClassMyData
$fFunClassMyData = C:FunClass $cfunctionInClass $canotherFunction
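With two methods a newtype is no longer enough, so the dictionary behaves like an ordinary data type with one field per method. Here is a rough hand-written model of that, where FunClassDict plays the role of C:FunClass and all names are hypothetical (the deriving clause is only there so the result can be inspected):

```haskell
-- Hypothetical stand-in for the two-method class dictionary:
-- one record field per class method.
data FunClassDict a = FunClassDict
  { functionInClass :: a -> ()
  , anotherFunction :: a
  }

data MyData = MyData
  deriving (Eq, Show)

-- Plays the role of
--   $fFunClassMyData = C:FunClass $cfunctionInClass $canotherFunction
dictMyData :: FunClassDict MyData
dictMyData = FunClassDict
  { functionInClass = \_ -> ()
  , anotherFunction = MyData
  }

-- Method calls select a field from the dictionary and apply it.
valueA :: ()
valueA = functionInClass dictMyData MyData

defaultMyData :: MyData
defaultMyData = anotherFunction dictMyData
```

The record selectors here stand in for the class method selectors, which pick the right field out of the dictionary they are given.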

It's important to note that cast doesn't do anything in the final code. Once the Core is converted to untyped STG and eventually Cmm and assembly, the casts are erased: they don't affect the values, they only adjust compile-time types to satisfy the Core typechecker. So, unless you're debugging GHC, you probably don't care about casts and should consider them no-ops. You can suppress some of the detail with -dsuppress-coercions (implied by -dsuppress-all):

$fFunClassYourData = $cfunctionInClass1 `cast` <Co:3>

and just pretend that x `cast` <Co:xxx> is exactly equivalent to x. In your example above, the dictionary is just the single instance function for the typeclass, so this is really the same, up to coercible types, as:

$fFunClassMyData = $cfunctionInClass

3) Sure. Additional class and instance information is stored in the mg_tcs and mg_insts fields of ModGuts, respectively. To a rough approximation, mg_binds contains the information needed for code generation, while mg_tcs and mg_insts contain the information needed to generate the interface file.

Helpful references to GHC compiler code

ghc/compiler/coreSyn/PprCore.hs - Module for pretty-printing Core. If you want to know where something in the Core output comes from, this is the place to look. (For example, ppr_expr add_par (Cast expr co) = ... is the code responsible for pretty-printing `cast` operators.)

ghc/compiler/coreSyn/CoreSyn.hs - The Expr type is the "core" of Core. The constructor Cast (Expr b) Coercion represents a cast.

ghc/compiler/types/TyCoRep.hs - The definition of Coercion is here.

ghc/compiler/main/HscTypes.hs - The definition of ModGuts and of the "subsets" of its fields: CgGuts, used for code generation, and ModIface / ModDetails, used for writing the interface file and linking.

ghc/compiler/main/TidyPgm.hs - Definition of the function tidyGuts, where the ModGuts information is split into CgGuts for code generation and ModDetails. ModDetails is a cached version of ModIface that is kept in memory when compiling multiple modules and/or used to generate a full ModIface to write out to an interface file.
