简体   繁体   中英

Data type design in Haskell

Learning Haskell, I write a formatter of C++ header files. First, I parse all class members into a-collection-of-class-members which is then passed to the formatting routine. To represent class members I have

data ClassMember = CmTypedef Typedef |
                   CmMethod Method |
                   CmOperatorOverload OperatorOverload |
                   CmVariable Variable |
                   CmFriendClass FriendClass |
                   CmDestructor Destructor

(I need to classify the class members this way because of some peculiarities of the formatting style.)

The problem that annoys me is that to "drag" any function defined for the class member types to the ClassMember level, I have to write a lot of redundant code. For example,

instance Formattable ClassMember where
    format (CmTypedef td) = format td
    format (CmMethod m) = format m
    format (CmOperatorOverload oo) = format oo
    format (CmVariable v) = format v
    format (CmFriendClass fc) = format fc
    format (CmDestructor d) = format d

instance Prettifyable ClassMember where
    -- same story here

On the other hand, I would definitely like to have a list of ClassMember objects (at least, I think so), hence defining it as

data ClassMember a = ClassMember a

instance Formattable ClassMember a
    format (ClassMember a) = format a

doesn't seem to be an option.

The alternatives I'm considering are:

  1. Store in ClassMember not object instances themselves, but functions defined on the corresponding types, which are needed by the formatting routine. This approach breaks the modularity, IMO, as the parsing results, represented by [ClassMember] , need to be aware of all their usages.

  2. Define ClassMember as an existential type, so [ClassMember] is no longer a problem. I doubt whether this design is strict enough and, again, I need to specify all constraints in the definition, like data ClassMember = forall a . Formattable a => ClassMember a data ClassMember = forall a . Formattable a => ClassMember a . Also, I would prefer a solution without using extensions.

Is what I'm doing a proper way to do it in Haskell or there is a better way?

First, consider trimming down that ADT a bit. Operator overloads and destructors are special kinds of methods, so it might make more sense to treat all three in CmMethod ; Method will then have special ways to separate them. Alternatively, keep all three CmMethod , CmOperatorOverload , and CmDestructor , but let them all contain the same Method type.

But of course, you can reduce the complexity only so much.

As for the specific example of a Show instance: you really don't want to write that yourself except in some special cases. For your case, it's much more reasonable to have the instance derived automatically:

data ClassMember = CmTypedef Typedef
                 | CmMethod Method
                 | ...
                 | CmDestructor Destructor
                 deriving (Show)

This will give different results from your custom instance – because yours is wrong: showing a contained result should also give information about the constructor.

If you're not really interested in Show but talking about another class C that does something more specific to ClassMember s – well, then you probably shouldn't have defined C in the first place! The purpose of type classes is to express mathematical concepts that hold for a great variety of types.

A possible solution is to use records. It can be used without extensions and preserves flexibility.

There is still some boilerplate code, but you need to type it only once for all. So if you would need to perform another set of operations over your ClassMember, it would be very easy and quick to do it.

Here is an example for your particular case (template Haskell and Control.Lens makes things easier but are not mandatory):

{-# LANGUAGE TemplateHaskell #-}

module Test.ClassMember

import Control.Lens

-- | The class member as initially defined.
data ClassMember =
      CmTypedef Typedef
    | CmMethod Method
    | CmOperatorOverload OperatorOverload
    | CmVariable Variable
    | CmFriendClass FriendClass
    | CmDestructor Destructor

-- | Some dummy definitions of the data types, so the code will compile.
data Typedef = Typedef
data Method = Method
data OperatorOverload = OperatorOverload
data Variable = Variable
data FriendClass = FriendClass
data Destructor = Destructor

{-|
A data type which defines one function per constructor.
Note the type a, which means that for a given Hanlder "a" all functions
must return "a" (as for a type class!).
-}
data Handler a = Handler
    {
      _handleType        :: Typedef -> a
    , _handleMethod      :: Method -> a
    , _handleOperator    :: OperatorOverload -> a
    , _handleVariable    :: Variable -> a
    , _handleFriendClass :: FriendClass -> a
    , _handleDestructor  :: Destructor -> a
    }

{-|
Here I am using lenses. This is not mandatory at all, but makes life easier.
This is also the reason of the TemplateHaskell language pragma above.
-}
makeLenses ''Handler

{-|
A function acting as a dispatcher (the boilerplate code!!!), telling which
function of the handler must be used for a given constructor.
-}
handle :: Handler a -> ClassMember -> a
handle handler member =
    case member of
        CmTypedef a          -> handler^.handleType $ a 
        CmMethod a           -> handler^.handleMethod $ a
        CmOperatorOverload a -> handler^.handleOperator $ a
        CmVariable a         -> handler^.handleVariable $ a
        CmFriendClass a      -> handler^.handleFriendClass $ a
        CmDestructor a)      -> handler^.handleDestructor $ a

{-|
A dummy format method.
I kept things simple here, but you could define much more complicated
functions.

You could even define some generic functions separately and... you could define
them with some extra arguments that you would only provide when building
the Handler! An (dummy!) example is the way the destructor function is
constructed.
-}
format :: Handler String
format = Handler
    (\x -> "type")
    (\x -> "method")
    (\x -> "operator")
    (\x -> "variable")
    (\x -> "Friend")
    (destructorFunc $ (++) "format ")

{-|
A dummy function showcasing partial application.
It has one more argument than handleDestructor. In practice you are free
to add as many as you wish as long as it ends with the expected type
(Destructor -> String).
-}
destructorFunc :: (String -> String) -> Destructor -> String
destructorFunc f _ = f "destructor"

{-|
Construction of the pretty handler which illustrates the reason why
using lens by keeping a nice and concise syntax.

The "&" is the backward operator and ".~" is the set operator.
All we do here is to change the functions of the handleType and the
handleDestructor.
-}
pretty :: Handler String
pretty = format & handleType       .~ (\x -> "Pretty type")
                & handleDestructor .~ (destructorFunc ((++) "Pretty "))

And now we can run some tests:

test1 = handle format (CmDestructor Destructor)
> "format destructor"

test2 = handle pretty (CmDestructor Destructor)
> "Pretty destructor"

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM