
Splitting type-classes and their instances into different submodules in Haskell

I am currently writing a small helper library and have run into the problem of a really huge source file in one of the modules. Basically, I am declaring a new parametric type class and want to implement it for two different monad stacks.

I decided to split the declaration of the type class and its implementations into different modules, but I keep getting warnings about orphan instances.

As far as I know, that can happen if it is possible to import a data type without its instance, i.e. if they are in different modules. But I have both the type declaration and the instance implementation inside each module.

To simplify the whole example, here is what I have now. First is the module where I define the type class:

-- File ~/library/src/Lib/API.hs 
module Lib.API where

-- Lots of imports

class (Monad m) => MyClass m where
  foo :: String -> m () 
  -- More functions are declared

Then the module with the instance implementation:

-- File ~/library/src/Lib/FirstImpl.hs
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE FlexibleInstances #-}
module Lib.FirstImpl where

import Lib.API
import Data.IORef
import Control.Monad.Reader

type FirstMonad = ReaderT (IORef String) IO

instance MyClass FirstMonad where
  foo = undefined

Both of them are listed in my project's .cabal file. It is also impossible to use FirstMonad without the instance, because they are defined in the same file.

However, when I launch ghci using stack ghci lib, I get the following warning:

~/library/src/Lib/FirstImpl.hs:11:1: warning: [-Worphans]
    Orphan instance: instance MyClass FirstMonad
    To avoid this
        move the instance declaration to the module of the class or of the type, or
        wrap the type with a newtype and declare the instance on the new type.
   |
11 | instance MyClass FirstMonad where
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^...
Ok, two modules loaded

What am I missing, and is there any way to split type-class declarations and their implementations into different submodules?

To avoid this, you can wrap the type in a newtype:

newtype FirstMonad a = FirstMonad (ReaderT (IORef String) IO a)
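The warning appears even though FirstMonad is written in the same file because a type synonym does not create a new type: the instance is really declared for ReaderT (IORef String) IO, and ReaderT is defined in another package. A newtype gives the module a type of its own. Below is a minimal single-file sketch of that idea, not the asker's actual code. It assumes GeneralizedNewtypeDeriving to recover the monad instances, uses Control.Monad.Trans.Reader from transformers rather than Control.Monad.Reader to keep dependencies small, and the names runFirstMonad and demo as well as the body of foo are hypothetical.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Trans.Reader (ReaderT (..), ask)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- In the real project this class would stay in Lib.API.
class Monad m => MyClass m where
  foo :: String -> m ()

-- A newtype is a distinct type owned by this module, so the instance
-- below is not an orphan even when the class is defined elsewhere.
newtype FirstMonad a = FirstMonad (ReaderT (IORef String) IO a)
  deriving (Functor, Applicative, Monad, MonadIO)

-- Hypothetical implementation: append the argument to the shared state.
instance MyClass FirstMonad where
  foo s = FirstMonad $ do
    ref <- ask
    liftIO (modifyIORef ref (++ s))

-- Hypothetical runner that unwraps the newtype.
runFirstMonad :: FirstMonad a -> IORef String -> IO a
runFirstMonad (FirstMonad m) = runReaderT m

-- Runs two calls to foo and returns the final contents of the state.
demo :: IO String
demo = do
  ref <- newIORef "a"
  runFirstMonad (foo "b" >> foo "c") ref
  readIORef ref

main :: IO ()
main = demo >>= putStrLn
```

The cost of this design is the wrapping and unwrapping at the boundary; the deriving clause keeps do-notation and liftIO usable inside FirstMonad.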

But if, after careful consideration, you feel you need orphan instances, you can suppress the warning:

{-# OPTIONS_GHC -fno-warn-orphans #-}

Detail

Coherence

For example, consider the following definition:

data A = A

instance Eq A where
   ...

A type class can be regarded as type-based overloading. With the instance above, the equality check (==) can be used at various types:

f :: Eq a => a -> a -> a -> Bool
f x y z = x == y && y == z

g :: A -> A -> A -> Bool
g x y z = x == y && y == z

In the definition of f, the type a is abstract and constrained by Eq, whereas in g the type A is concrete. In the former the method is derived from the constraint, but Haskell can derive it in the latter as well. It does so by elaborating Haskell into a language that has no type classes. This approach is called dictionary passing.

class C a where
  m1 :: a -> a

instance C A where
  m1 x = x

f :: C a => a -> a
f = m1 . m1

It will be converted to:

data DictC a = DictC
  { m1 :: a -> a
  }

instDictC_A :: DictC A
instDictC_A = DictC
  { m1 = \x -> x
  }

f :: DictC a -> a -> a
f d = m1 d . m1 d

As shown above, a data type called a dictionary is made for each type class, and a value of that type is passed around.
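The class version and its dictionary-passing translation can be put side by side in one file to check that they behave the same. This is only a hand-written sketch of the idea, not what GHC literally generates. The names m1D and fD are hypothetical, chosen so that the dictionary version does not clash with the class method.

```haskell
-- Class-based version, as it would be written in source code.
class C a where
  m1 :: a -> a

data A = A
  deriving (Eq, Show)

instance C A where
  m1 x = x

f :: C a => a -> a
f = m1 . m1

-- Hand-written dictionary-passing version of the same program.
-- The record plays the role of the class, the value the role of the instance.
data DictC a = DictC
  { m1D :: a -> a
  }

instDictC_A :: DictC A
instDictC_A = DictC
  { m1D = \x -> x
  }

-- The constraint `C a =>` becomes an ordinary function argument.
fD :: DictC a -> a -> a
fD d = m1D d . m1D d

main :: IO ()
main = do
  print (f A)               -- instance chosen by the type checker
  print (fD instDictC_A A)  -- dictionary passed by hand
```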

Haskell has a constraint that a type may not be declared as an instance of a particular class more than once in the program. Without this constraint, various problems arise.

class C1 a where
  m1 :: a

class C1 a => C2 a where
  m2 :: a -> a

instance C1 Int where
  m1 = 0

instance C2 Int where
  m2 x = x + 1

f :: (C1 a, C2 a) => a
f = m2 m1

g :: Int
g = f

This code uses a superclass (type class inheritance). It elaborates into the following code:

data DictC1 a = DictC1
  { m1 :: a
  }

data DictC2 a = DictC2
  { superC1 :: DictC1 a
  , m2 :: a -> a
  }

instDictC1_Int :: DictC1 Int
instDictC1_Int = DictC1
  { m1 = 0
  }

instDictC2_Int :: DictC2 Int
instDictC2_Int = DictC2
  { superC1 = instDictC1_Int
  , m2 = \x -> x + 1
  }

f :: DictC1 a -> DictC2 a -> a
f d1 d2 = ???

g :: Int
g = f instDictC1_Int instDictC2_Int

Well, what should the definition of f be? Actually, there are two possible definitions:

f :: DictC1 a -> DictC2 a -> a
f d1 d2 = m2 d2 (m1 d1)

f :: DictC1 a -> DictC2 a -> a
f _ d2 = m2 d2 (m1 d1)
  where
    d1 = superC1 d2

Do you see that both type-check without any problem? If Haskell allowed Int to be declared as an instance of C1 repeatedly, the superC1 field of DictC2 would be filled in during elaboration, and its value might differ from the DictC1 a passed to f when g is called.
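The two candidate definitions of f can be compared directly by writing the dictionaries by hand. In this sketch, instDictC1_Int' stands for a hypothetical second C1 instance for Int, which real Haskell rejects; the names fExplicit and fSuper are also made up here to tell the two elaborations apart.

```haskell
data DictC1 a = DictC1
  { m1 :: a
  }

data DictC2 a = DictC2
  { superC1 :: DictC1 a
  , m2 :: a -> a
  }

instDictC1_Int :: DictC1 Int
instDictC1_Int = DictC1 { m1 = 0 }

-- Hypothetical second instance of C1 for Int (forbidden in real Haskell).
instDictC1_Int' :: DictC1 Int
instDictC1_Int' = DictC1 { m1 = 1 }

instDictC2_Int :: DictC2 Int
instDictC2_Int = DictC2
  { superC1 = instDictC1_Int
  , m2 = \x -> x + 1
  }

-- Elaboration 1: take m1 from the C1 dictionary passed explicitly.
fExplicit :: DictC1 a -> DictC2 a -> a
fExplicit d1 d2 = m2 d2 (m1 d1)

-- Elaboration 2: take m1 from the superclass field of the C2 dictionary.
fSuper :: DictC1 a -> DictC2 a -> a
fSuper _ d2 = m2 d2 (m1 d1)
  where
    d1 = superC1 d2

main :: IO ()
main = do
  -- With a single C1 instance both elaborations agree.
  print (fExplicit instDictC1_Int instDictC2_Int, fSuper instDictC1_Int instDictC2_Int)   -- (1,1)
  -- With a second instance in play they diverge.
  print (fExplicit instDictC1_Int' instDictC2_Int, fSuper instDictC1_Int' instDictC2_Int) -- (2,1)
```

Both functions have the same type, so the type checker cannot tell them apart; only the uniqueness of instances makes the choice between them harmless.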

Let's look at another example:

h :: (Int, Int)
h = (m1, m1)

Of course, there is only one elaboration:

h :: (Int, Int)
h = (m1 instDictC1_Int, m1 instDictC1_Int)

But if instances could be defined repeatedly, the following elaboration would also be possible:

h :: (Int, Int)
h = (m1 instDictC1_Int, m1 instDictC1_Int')

Hence, two different instances are applied to the same type. For example, calling the same function twice might return different values computed by different algorithms.

That example is a little exaggerated, but how about the next one?

instance C1 Int where
  m1 = 0

h1 :: Int
h1 = m1

instance C1 Int where
  m1 = 1

h2 :: (Int, Int)
h2 = (m1, h1)

In this case, it is quite possible that m1 in h1 and m1 in h2 use different instances. Haskell code is often transformed by equational reasoning, so it would be a problem if h1 could not be replaced directly by m1.
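The failure of equational reasoning can be made concrete with explicit dictionaries. This is a sketch under the assumption that h1 was resolved against the first instance and h2 against the second; the names dictFirst, dictSecond and h2Inlined are hypothetical.

```haskell
newtype DictC1 a = DictC1
  { m1 :: a
  }

-- Two hypothetical C1 dictionaries for Int, one per instance declaration.
dictFirst :: DictC1 Int
dictFirst = DictC1 { m1 = 0 }

dictSecond :: DictC1 Int
dictSecond = DictC1 { m1 = 1 }

-- h1 was elaborated while only the first instance was visible.
h1 :: Int
h1 = m1 dictFirst

-- h2 was elaborated against the second instance.
h2 :: (Int, Int)
h2 = (m1 dictSecond, h1)

-- h2 after replacing h1 by its source-level definition `m1`
-- and resolving that occurrence in the context of h2.
h2Inlined :: (Int, Int)
h2Inlined = (m1 dictSecond, m1 dictSecond)

main :: IO ()
main = do
  print h2         -- (1,0)
  print h2Inlined  -- (1,1)
```

Substituting a definition for its name changed the result, which is exactly what a coherent system rules out.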

Generally, a type system includes the resolution of type class instances. In that case, instances are resolved during type checking, and the code is elaborated from the derivation tree built during type checking. This kind of transformation is sometimes used for features other than type classes, such as implicit type conversions and record types. Those cases can then cause the same problem as above. The problem can be formalized as follows:

When typing derivation trees are converted into the target language, two different derivation trees for the same typing produce results that are not semantically equivalent.

As stated, whichever instance matching the type is applied, the program generally still passes type checking. However, the result of elaboration using one instance may differ from the result of elaboration using another instance. Conversely, if this problem does not occur, the type system gives us a certain guarantee. This guarantee, that the problem formalized above cannot occur for the combination of a type system and its elaboration, is generally called coherence. There are several ways to guarantee coherence; Haskell limits the number of instance definitions per type and type class to one in order to guarantee it.

Orphan Instance

What Haskell does is easy to state, but it has some issues. A well-known one is the orphan instance. When a type T is declared as an instance of a class C, GHC treats the instance differently depending on whether the instance declaration is in the same module as the declaration of T or of C. In particular, an instance that is in neither module is called an orphan instance, and GHC warns about it. Why does it work this way?

First, in Haskell, instances propagate implicitly between modules. This is stipulated as follows:

All instances in scope within a module are always exported and any import brings all instances in from the imported module. Thus, an instance declaration is in scope if and only if a chain of import declarations leads to the module containing the instance declaration. -- 5 Modules

We cannot stop this or control it. In the first place, Haskell decided to allow only one instance per type, so there is no need to worry about it. Incidentally, it is good that such a rule exists, because a Haskell compiler actually has to resolve instances according to it. Of course, the compiler does not know which modules contain instances, so in the worst case it must check all modules. It also bothers us as users. If two important modules each hold an instance definition for the same type, every module whose import chain includes both of them becomes unusable because of the conflict.

Now, to use a type as an instance of a class, we need information about both of them, so we will look at the modules that contain their declarations. A third party is not going to tamper with those modules. Therefore, if either of those modules contains the instance declaration, the compiler can see the necessary information together with the instance, and we get the guarantee that being able to load the modules means there are no conflicts. For that reason, it is recommended to place an instance declaration in the same module as the declaration of the type or of the class. Conversely, it is recommended to avoid orphan instances as much as possible. Hence, if you want a type to have an independent instance, make a new type with newtype, whose only purpose is to change the semantics of the instance, and declare that new type as the instance.
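Using a newtype only to change the semantics of an instance can be sketched with Ord. The wrapper Desc and the helper sortDescending are hypothetical names; Data.Ord.Down in base plays a similar role.

```haskell
import Data.List (sort)

-- Hypothetical wrapper: same representation as Int, but ordered descending.
newtype Desc = Desc Int
  deriving (Eq, Show)

-- The instance belongs to this module's own type, so it is not an orphan
-- and does not conflict with the standard Ord Int instance.
instance Ord Desc where
  compare (Desc a) (Desc b) = compare b a

unDesc :: Desc -> Int
unDesc (Desc n) = n

-- Sort by wrapping, sorting with the alternative instance, and unwrapping.
sortDescending :: [Int] -> [Int]
sortDescending = map unDesc . sort . map Desc

main :: IO ()
main = do
  print (sort [3, 1, 2 :: Int])     -- [1,2,3]
  print (sortDescending [3, 1, 2])  -- [3,2,1]
```

Both orderings coexist in one program because they are attached to different types, so instance resolution stays unambiguous.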

In addition, GHC internally marks modules that have orphan instances, and such modules are listed in the interface files of the modules that depend on them. The compiler then consults every module in that list. Thus, once an orphan instance is created, the interface file of the module containing it is reloaded whenever any module depending on it is recompiled, whatever the change. So orphan instances have a bad effect on compile time.

Detail is under CC BY-SA 4.0 (C) Mizunashi Mana

Original is 続くといいな日記 – 型クラスの Coherence と Orphan Instance

2020-12-22 revised and translated by Akihito Kirisaki
