
Restriction on the data type definition

I have a type synonym type Entity = ([Feature], Body) for whatever Feature and Body mean. Objects of Entity type are to be grouped together:

type Bunch = [Entity]

and the assumption, crucial for the algorithm working with Bunch, is that any two entities in the same bunch have the same number of features.

If I were to implement this constraint in an OOP language, I would add the corresponding check to the method encapsulating the addition of entities into a bunch. Is there a better way to do it in Haskell? Preferably, on the definition level. (If the definition of Entity also needs to be changed, no problem.)

Using type-level length annotations

So here's the deal. Haskell can encode type-level natural numbers, and you can annotate a type with them using "phantom types". However you do it, the types will look like this:

data Z
data S n
data LAList x len = LAList [x] -- length-annotated list

Then you can add some construction functions for convenience:

lalist1 :: x -> LAList x (S Z)
lalist1 x = LAList [x]
lalist2 :: x -> x -> LAList x (S (S Z))
lalist2 x y = LAList [x, y]
-- ...

And then you've got more generic functions:

(~:) :: x -> LAList x n -> LAList x (S n)
x ~: LAList xs = LAList (x : xs)
infixr 5 ~:

nil :: LAList x Z
nil = LAList []

lahead :: LAList x (S n) -> x
lahead (LAList xs) = head xs

latail :: LAList x (S n) -> LAList x n
latail (LAList xs) = LAList (tail xs)

But by itself the standard list type doesn't carry any of this, because it would complicate things. You may be interested in the Data.FixedList package for a somewhat different approach, too. Basically every approach is going to start off looking a little weird with some data type that has no constructor, but it starts to look normal after a little bit.
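As a rough usage sketch of the phantom-type definitions above: the helper toList and the value three are names made up for this illustration, and the safety argument assumes the LAList constructor is kept private to its module, so that only nil and (~:) can build values.

```haskell
-- Phantom type-level naturals: no constructors, used only as tags.
data Z
data S n

-- Length-annotated list. In a real module the LAList constructor would
-- not be exported, so the annotation could not be forged.
data LAList x len = LAList [x]

nil :: LAList x Z
nil = LAList []

infixr 5 ~:
(~:) :: x -> LAList x n -> LAList x (S n)
x ~: LAList xs = LAList (x : xs)

-- head and tail are safe here only because the type says the list is non-empty.
lahead :: LAList x (S n) -> x
lahead (LAList xs) = head xs

latail :: LAList x (S n) -> LAList x n
latail (LAList xs) = LAList (tail xs)

-- Hypothetical helper: forget the length annotation.
toList :: LAList x n -> [x]
toList (LAList xs) = xs

-- The type records that this list has exactly three elements.
three :: LAList Int (S (S (S Z)))
three = 1 ~: 2 ~: 3 ~: nil

-- Neither of these type-checks, because Z does not match S n:
-- lahead nil
-- latail (latail (latail (latail three)))
```

Two entities whose feature lists have types LAList Feature n with different n then simply cannot be passed to a function that expects the same n for both.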

You might also be able to write a type class so that all of the lalist1, lalist2 functions above can be replaced with a single method. The length has to be determined by the container type, so the class needs something like an associated type (which requires the TypeFamilies extension):

class FixedLength t where
    type Len t
    la :: t x -> LAList x (Len t)

A plain synonym such as type Pair x = (x, x) cannot be used as the instance head, because a partially applied type synonym is not allowed there, so you would wrap the tuple in a newtype instead:

newtype Pair x = Pair (x, x)
instance FixedLength Pair where
    type Len Pair = S (S Z)
    la (Pair (a, b)) = LAList [a, b]

(going directly from (a, b) to Pair a is where the mismatch shows up).

Using runtime checking

You can very easily take a different approach and turn a violation into a runtime error, or explicitly model the failure in your code:

-- this may change if you change your definition of the Entity type
features :: Entity -> [Feature]
features = fst 

-- we also assume a runBunch :: [Entity] -> Something function 
-- that you're trying to run on this Bunch.

allTheSame :: (Eq x) => [x] -> Bool
allTheSame (x : xs) = all (x ==) xs
allTheSame [] = True

permissiveBunch :: [Entity] -> Maybe Something
permissiveBunch es
  | allTheSame (map (length . features) es) = Just (runBunch es)
  | otherwise = Nothing

strictBunch :: [Entity] -> Something
strictBunch es 
  | allTheSame (map (length . features) es) = runBunch es
  | otherwise = error ("runBunch requires all feature lists to be the same length; saw instead " ++ show (map (length . features) es))

Then your runBunch can just assume that all the lengths are the same, since that is explicitly checked above. If you need to pair up the features of two entities, you can get around pattern-matching weirdness with, say, the zip :: [a] -> [b] -> [(a, b)] function in the Prelude. (The weirdness to avoid is an algorithm that pattern-matches only on runBunch' (x:xs) (y:ys) and runBunch' [] []: the compiler then warns that there are 2 patterns which you have not considered in the match.)
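A minimal sketch of how the checked variant and zip could fit together. The concrete types for Feature, Body and Something, as well as the distance function and the body of runBunch, are placeholders invented for this example; the question leaves all of them abstract.

```haskell
-- Placeholder types, assumed only for this sketch.
type Feature = Double
type Body = String
type Entity = ([Feature], Body)
type Something = [Double]

features :: Entity -> [Feature]
features = fst

allTheSame :: Eq x => [x] -> Bool
allTheSame (x : xs) = all (x ==) xs
allTheSame [] = True

-- Hypothetical per-pair computation: squared distance between two feature
-- lists. zip pairs the features up, so no (x:xs) (y:ys) matching is needed.
distance :: Entity -> Entity -> Double
distance e1 e2 =
  sum [(a - b) ^ (2 :: Int) | (a, b) <- zip (features e1) (features e2)]

-- Hypothetical runBunch: distances between neighbouring entities.
-- It relies on the length check in permissiveBunch having passed.
runBunch :: [Entity] -> Something
runBunch es = zipWith distance es (drop 1 es)

permissiveBunch :: [Entity] -> Maybe Something
permissiveBunch es
  | allTheSame (map (length . features) es) = Just (runBunch es)
  | otherwise = Nothing
```

Note that zip silently truncates to the shorter list, so without the length check a malformed bunch would produce a wrong answer rather than a crash; that is why the check stays in front of runBunch.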

Using tuples and type classes

One final way to do it, which is a compromise between the two (but makes for pretty good Haskell code), involves making Entity parametrized over its features:

type Entity x = (x, Body)

and then including a function which can zip the features of two entities together, with one instance for each feature count:

class ZippableFeatures z where
    fzip :: z -> z -> [(Feature, Feature)]

instance ZippableFeatures () where
    fzip () () = []

instance ZippableFeatures Feature where
    fzip f1 f2 = [(f1, f2)]

instance ZippableFeatures (Feature, Feature) where
    fzip (a1, a2) (b1, b2) = [(a1, b1), (a2, b2)]

Then you can use tuples for your feature lists, as long as they don't get any larger than the maximum tuple length (which is 15 on my GHC). If you go larger than that, of course, you can always define your own data types, but it's not going to be as general as type-annotated lists.

If you do this, your type signature for runBunch will simply look like:

runBunch :: (ZippableFeatures z) => [Entity z] -> Something

When you run it on things with the wrong number of features you'll get a compiler error saying that it can't unify the type (a, b) with (a, b, c).
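To make the tuple approach concrete, here is a small sketch. Feature is wrapped in a newtype and Something is given a concrete shape purely for illustration, and the body of runBunch is an assumption; only the class and the signature come from the text above. The FlexibleInstances pragma is there because the instance heads mention the concrete type Feature inside a tuple.

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Placeholder types, assumed only for this sketch.
newtype Feature = Feature Double deriving (Eq, Show)
type Body = String
type Entity x = (x, Body)
type Something = [[(Feature, Feature)]]

class ZippableFeatures z where
  fzip :: z -> z -> [(Feature, Feature)]

instance ZippableFeatures (Feature, Feature) where
  fzip (a1, a2) (b1, b2) = [(a1, b1), (a2, b2)]

instance ZippableFeatures (Feature, Feature, Feature) where
  fzip (a1, a2, a3) (b1, b2, b3) = [(a1, b1), (a2, b2), (a3, b3)]

-- Hypothetical runBunch: pair up the features of neighbouring entities.
runBunch :: ZippableFeatures z => [Entity z] -> Something
runBunch es = zipWith (\(f1, _) (f2, _) -> fzip f1 f2) es (drop 1 es)

pairs :: [Entity (Feature, Feature)]
pairs = [((Feature 1, Feature 2), "a"), ((Feature 3, Feature 4), "b")]

-- Mixing arities is rejected at compile time, because every element of a
-- list must have the same type:
-- bad = [((Feature 1, Feature 2), "a"), ((Feature 1, Feature 2, Feature 3), "b")]
```

The constraint is enforced by the list's element type rather than by any check in runBunch, which is what makes this a definition-level guarantee.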

There are various ways to enforce length constraints like that; here's one:

{-# LANGUAGE DataKinds, KindSignatures, GADTs, TypeFamilies #-}
import Prelude hiding (foldr)
import Data.Foldable
import Data.Monoid
import Data.Traversable
import Control.Applicative

data Feature  -- Whatever that really is

data Body  -- Whatever that really is

data Nat = Z | S Nat  -- Natural numbers

type family Plus (m::Nat) (n::Nat) where  -- Type level natural number addition
  Plus Z n = n
  Plus (S m) n = S (Plus m n)

data LList (n :: Nat) a where  -- Lists tagged with their length at the type level
  Nil :: LList Z a
  Cons :: a -> LList n a -> LList (S n) a

Some functions on these lists:

llHead :: LList (S n) a -> a
llHead (Cons x _) = x

llTail :: LList (S n) a -> LList n a
llTail (Cons _ xs) = xs

llAppend :: LList m a -> LList n a -> LList (Plus m n) a
llAppend Nil ys = ys
llAppend (Cons x xs) ys = Cons x (llAppend xs ys)

data Entity n = Entity (LList n Feature) Body

data Bunch where
   Bunch :: [Entity n] -> Bunch

Some instances:

instance Functor (LList n) where
   fmap f Nil = Nil
   fmap f (Cons x xs) = Cons (f x) (fmap f xs)

instance Foldable (LList n) where
   foldMap f Nil = mempty
   foldMap f (Cons x xs) = f x `mappend` foldMap f xs

instance Traversable (LList n) where
   traverse f Nil = pure Nil
   traverse f (Cons x xs) = Cons <$> f x <*> traverse f xs

And so on. Note that n in the definition of Bunch is existential. It can be anything, and what it actually is doesn't affect the type: all bunches have the same type. This limits what you can do with bunches to a certain extent. Alternatively, you can tag the bunch with the length of its feature lists. It all depends on what you need to do with this stuff in the end.
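The tagged alternative mentioned at the end could look roughly like this. The placeholder types for Feature and Body, the tagged Bunch newtype, and the helpers llZipWith, llToList and sumFeatures are all assumptions made up for this sketch, not part of the code above.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

data LList (n :: Nat) a where
  Nil  :: LList Z a
  Cons :: a -> LList n a -> LList (S n) a

-- Placeholder types, assumed only for this sketch.
type Feature = Double
type Body = String

data Entity (n :: Nat) = Entity (LList n Feature) Body

-- The bunch is tagged with the feature count, so n is visible to callers
-- instead of being hidden existentially.
newtype Bunch (n :: Nat) = Bunch [Entity n]

-- Both arguments share n, so the mismatched cases cannot occur and do not
-- need to be written.
llZipWith :: (a -> b -> c) -> LList n a -> LList n b -> LList n c
llZipWith _ Nil Nil = Nil
llZipWith f (Cons x xs) (Cons y ys) = Cons (f x y) (llZipWith f xs ys)

llToList :: LList n a -> [a]
llToList Nil = []
llToList (Cons x xs) = x : llToList xs

-- Hypothetical operation: element-wise sum of all feature vectors,
-- or Nothing for an empty bunch.
sumFeatures :: Bunch n -> Maybe (LList n Feature)
sumFeatures (Bunch []) = Nothing
sumFeatures (Bunch (Entity fs _ : es)) =
  Just (foldl (\acc (Entity gs _) -> llZipWith (+) acc gs) fs es)
```

The trade-off is the one described above: with the tag, functions can return results whose length is tied to the bunch, but bunches of different feature counts now have different types and cannot be stored in the same list.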
