
Data type design in Haskell

While learning Haskell, I am writing a formatter for C++ header files. First, I parse all class members into a collection of class members, which is then passed to the formatting routine. To represent class members I have

data ClassMember = CmTypedef Typedef |
                   CmMethod Method |
                   CmOperatorOverload OperatorOverload |
                   CmVariable Variable |
                   CmFriendClass FriendClass |
                   CmDestructor Destructor

(I need to classify the class members this way because of some peculiarities of the formatting style.)

The problem that annoys me is that to "drag" any function defined for the class member types up to the ClassMember level, I have to write a lot of redundant code. For example,

instance Formattable ClassMember where
    format (CmTypedef td) = format td
    format (CmMethod m) = format m
    format (CmOperatorOverload oo) = format oo
    format (CmVariable v) = format v
    format (CmFriendClass fc) = format fc
    format (CmDestructor d) = format d

instance Prettifyable ClassMember where
    -- same story here

On the other hand, I would definitely like to have a list of ClassMember objects (at least, I think so), hence defining it as

data ClassMember a = ClassMember a

instance Formattable a => Formattable (ClassMember a) where
    format (ClassMember a) = format a

doesn't seem to be an option.

The alternatives I'm considering are:

  1. Store in ClassMember not the object instances themselves, but the functions defined on the corresponding types that the formatting routine needs. This approach breaks modularity, IMO, because the parsing results, represented by [ClassMember], need to be aware of all their usages.

  2. Define ClassMember as an existential type, so [ClassMember] is no longer a problem. I doubt whether this design is strict enough and, again, I need to specify all constraints in the definition, like data ClassMember = forall a . Formattable a => ClassMember a. Also, I would prefer a solution without using extensions.
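
For comparison, the two alternatives above might look roughly like the following. This is only a sketch: the Formattable class body, the two placeholder member types, and the names MemberOps, toOps and AnyMember are assumptions made for illustration and do not come from the original code.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Hypothetical stand-ins for the question's class and member types.
class Formattable a where
  format :: a -> String

data Typedef = Typedef
data Destructor = Destructor

instance Formattable Typedef where
  format _ = "typedef"

instance Formattable Destructor where
  format _ = "destructor"

-- Alternative 1: store only the operations the consumer needs.
-- Every new operation requires a new field here, which is the
-- coupling between parser and consumers that the question objects to.
newtype MemberOps = MemberOps { opsFormat :: String }

toOps :: Formattable a => a -> MemberOps
toOps x = MemberOps { opsFormat = format x }

-- Alternative 2: hide the concrete type behind an existential wrapper.
-- Only the methods of the listed constraints remain usable afterwards.
data AnyMember = forall a. Formattable a => AnyMember a

instance Formattable AnyMember where
  format (AnyMember a) = format a

-- Both variants allow a homogeneous list of members.
viaOps :: [String]
viaOps = map opsFormat [toOps Typedef, toOps Destructor]

viaExistential :: [String]
viaExistential = map format [AnyMember Typedef, AnyMember Destructor]
```

In both variants the original constructor is no longer recoverable from the list, so any classification the formatting style depends on would have to be captured up front.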

Is what I'm doing the proper way to do this in Haskell, or is there a better way?

First, consider trimming down that ADT a bit. Operator overloads and destructors are special kinds of methods, so it might make more sense to handle all three under CmMethod; Method would then need some way to distinguish them. Alternatively, keep all three of CmMethod, CmOperatorOverload, and CmDestructor, but let them all contain the same Method type.
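
A minimal sketch of the first suggestion, assuming a hypothetical MethodKind tag and a two-field Method record (the real Method type would presumably carry much more parsed information):

```haskell
-- Hypothetical tag distinguishing the three kinds of method.
data MethodKind = Regular | OperatorOverload | Destructor
  deriving (Show, Eq)

-- Assumed shape of a parsed method, for illustration only.
data Method = Method
  { methodKind :: MethodKind
  , methodName :: String
  } deriving (Show, Eq)

-- Placeholder definitions for the remaining member types.
data Typedef = Typedef deriving (Show, Eq)
data Variable = Variable deriving (Show, Eq)
data FriendClass = FriendClass deriving (Show, Eq)

-- The ADT shrinks from six constructors to four.
data ClassMember
  = CmTypedef Typedef
  | CmMethod Method
  | CmVariable Variable
  | CmFriendClass FriendClass
  deriving (Show, Eq)

-- Code that cares about the distinction can still branch on the tag.
isDestructor :: ClassMember -> Bool
isDestructor (CmMethod m) = methodKind m == Destructor
isDestructor _            = False
```

Every function over ClassMember then needs four cases instead of six, at the cost of an extra check wherever the kind of method matters.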

But of course, you can reduce the complexity only so much.

As for the specific example of a Show instance: you really don't want to write that yourself except in some special cases. For your case, it's much more reasonable to have the instance derived automatically:

data ClassMember = CmTypedef Typedef
                 | CmMethod Method
                 | ...
                 | CmDestructor Destructor
                 deriving (Show)

This will give different results from your custom instance, because yours is wrong: showing a wrapped value should also give information about the constructor.

If you're not really interested in Show but are talking about another class C that does something more specific to ClassMembers, then you probably shouldn't have defined C in the first place! The purpose of type classes is to express mathematical concepts that hold for a great variety of types.
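
To illustrate both points, here is a small sketch with only two of the six constructors and placeholder member types. The function name formatMember and its output strings are made up for the example.

```haskell
-- Placeholder member types; deriving Show is needed so that the
-- enclosing type can derive Show as well.
data Typedef = Typedef deriving (Show)
data Destructor = Destructor deriving (Show)

-- Reduced to two constructors to keep the sketch short.
data ClassMember
  = CmTypedef Typedef
  | CmDestructor Destructor
  deriving (Show)

-- The derived instance includes the constructor name.
shown :: String
shown = show (CmDestructor Destructor)
-- "CmDestructor Destructor"

-- A ClassMember-specific operation written as an ordinary function,
-- with no type class involved (hypothetical name and output).
formatMember :: ClassMember -> String
formatMember (CmTypedef _)    = "typedef"
formatMember (CmDestructor _) = "destructor"
```

The per-constructor cases do not disappear, but they now live in one plain function per operation instead of one class plus one instance per operation.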

A possible solution is to use records. They need no extensions and preserve flexibility.

There is still some boilerplate code, but you only need to write it once. If you later need to perform another set of operations over your ClassMember, adding it is quick and easy.

Here is an example for your particular case (Template Haskell and Control.Lens make things easier but are not mandatory):

{-# LANGUAGE TemplateHaskell #-}

module Test.ClassMember where

import Control.Lens

-- | The class member as initially defined.
data ClassMember =
      CmTypedef Typedef
    | CmMethod Method
    | CmOperatorOverload OperatorOverload
    | CmVariable Variable
    | CmFriendClass FriendClass
    | CmDestructor Destructor

-- | Some dummy definitions of the data types, so the code will compile.
data Typedef = Typedef
data Method = Method
data OperatorOverload = OperatorOverload
data Variable = Variable
data FriendClass = FriendClass
data Destructor = Destructor

{-|
A data type which defines one function per constructor.
Note the type a, which means that for a given Handler "a" all functions
must return "a" (as for a type class!).
-}
data Handler a = Handler
    {
      _handleType        :: Typedef -> a
    , _handleMethod      :: Method -> a
    , _handleOperator    :: OperatorOverload -> a
    , _handleVariable    :: Variable -> a
    , _handleFriendClass :: FriendClass -> a
    , _handleDestructor  :: Destructor -> a
    }

{-|
Here I am using lenses. This is not mandatory at all, but makes life easier.
It is also the reason for the TemplateHaskell language pragma above.
-}
makeLenses ''Handler

{-|
A function acting as a dispatcher (the boilerplate code!!!), telling which
function of the handler must be used for a given constructor.
-}
handle :: Handler a -> ClassMember -> a
handle handler member =
    case member of
        CmTypedef a          -> handler^.handleType $ a
        CmMethod a           -> handler^.handleMethod $ a
        CmOperatorOverload a -> handler^.handleOperator $ a
        CmVariable a         -> handler^.handleVariable $ a
        CmFriendClass a      -> handler^.handleFriendClass $ a
        CmDestructor a       -> handler^.handleDestructor $ a

{-|
A dummy format method.
I kept things simple here, but you could define much more complicated
functions.

You could even define some generic functions separately and... you could define
them with some extra arguments that you would only provide when building
the Handler! A (dummy!) example is the way the destructor function is
constructed.
-}
format :: Handler String
format = Handler
    (\_ -> "type")
    (\_ -> "method")
    (\_ -> "operator")
    (\_ -> "variable")
    (\_ -> "Friend")
    (destructorFunc $ (++) "format ")

{-|
A dummy function showcasing partial application.
It has one more argument than handleDestructor. In practice you are free
to add as many as you wish as long as it ends with the expected type
(Destructor -> String).
-}
destructorFunc :: (String -> String) -> Destructor -> String
destructorFunc f _ = f "destructor"

{-|
Construction of the pretty handler, which shows why lenses are useful here:
the syntax stays short and readable.

"&" is the reverse application operator and ".~" is the setter operator.
All we do here is replace the handleType and handleDestructor functions.
-}
pretty :: Handler String
pretty = format & handleType       .~ (\x -> "Pretty type")
                & handleDestructor .~ (destructorFunc ((++) "Pretty "))

And now we can run some tests:

test1 = handle format (CmDestructor Destructor)
> "format destructor"

test2 = handle pretty (CmDestructor Destructor)
> "Pretty destructor"
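
Since the lenses are described as optional, the same idea can be sketched with plain record syntax and no extensions. The version below is cut down to two constructors for brevity; the field names without a leading underscore and the string results are illustrative assumptions, not part of the code above.

```haskell
-- Placeholder member types, as in the example above.
data Typedef = Typedef
data Destructor = Destructor

-- Reduced to two constructors to keep the sketch short.
data ClassMember
  = CmTypedef Typedef
  | CmDestructor Destructor

-- One function per constructor, all returning the same type a.
data Handler a = Handler
  { handleType       :: Typedef -> a
  , handleDestructor :: Destructor -> a
  }

-- The dispatcher: the only place that pattern-matches on ClassMember.
handle :: Handler a -> ClassMember -> a
handle h (CmTypedef t)    = handleType h t
handle h (CmDestructor d) = handleDestructor h d

format :: Handler String
format = Handler
  { handleType       = const "type"
  , handleDestructor = const "format destructor"
  }

-- Record update syntax plays the role of the lens setters.
pretty :: Handler String
pretty = format
  { handleType       = const "Pretty type"
  , handleDestructor = const "Pretty destructor"
  }
```

With these definitions, handle format (CmDestructor Destructor) evaluates to "format destructor" and handle pretty (CmTypedef Typedef) to "Pretty type". Record update is slightly more verbose than the lens setters when fields are nested, but for a flat record like this one the difference is small.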


 