简体   繁体   中英

Why do we need “Algebraic Data Types”?

I've read some explanation of Algebraic Data Types:

These articles give very detail description and code samples.

At first I was thinking Algebraic Data Type was just for defining some types easily and we can match them with pattern matching. But after reading these articles, I found "pattern matching" is not even mentioned there, and the content looks interesting but far more complex than I expected.

So I have some questions (which are not answered in these articles):

  • Why do we need it, say, in Haskell or Scala?
  • What we can do if we have it, and what we can't do if we don't have it?

We should start with the Haskell wiki article Algebraic Data Types

And here, shortly, just my vision:

  • we need them to model business logic in old object oriented way (or actually in old categories-based way), and make it more typesafe as compiler can check that you've matched all possible choices. Or, in other words, that your function is total , not partial. Eventually, it gives the compiler an ability to proof the correctness of your code (that's why sealed traits are recommended). So, the more different types you have the better - btw, generic programming helps you here as it produces more types.
  • standard features: we can represent type as a "set" of objects, we can match object with type (using pattern matching), or even deconstruct (analyze) it with matchers. We can also dynamically add behavior (in compile-time) to such type with type classess. It's possible for regular class as well, but here it gives us an ability to separate algebraic model (types) from behavior (functions)
  • we can construct types as products / coproducts of different objects/types. You can think of algebraic type system as a set (or more generally - as a cartesian-closed category). type Boolean = True | False type Boolean = True | False means that Boolean is a union (coproduct) of True and False . Cat(height: Height, weight: Weight) is a "tuple" (more generally - product) of Height and Weight . Product (more-less) represents "part of" relationship from OOP, coproduct - "is" (but in the opposite way).

It also gives us a way to dispatch behaviour in runtime in multimethod-style (like it was in CLOS ):

  sealed trait Animal
  case class Cat extends Animal
  case class Dog extends Animal

 def move(c: Animal) = c match {
   case Cat(...) => ...
   case Dog(...) =>
   case a: Animal => ...//if you need default implementation
 }

Haskell-like:

 data Animal = Dog | Cat //coproduct

 let move Dog d = ...
 let move Cat c = ...

Instead of:

trait Animal {
  ...
  def move = ...
}

class Cat(val ...) extends Animal {
  override def move = ...
}

class Dog(val ...) extends Animal {
  override def move = ...
}

PS Theoretically, if you're modeling the world in algebraic way and your functions are total and pure - you can proof that your application is correct. If it compiles - it works :).

PS2. I should also mention that Scala's type hierarchy which has Any in it isn't so good for typesafety (but good for interop with Java and IO) as it breaks the nice structure defined by GADTs. More than that case class might be both GADT (algebraic) and ADT (abstract) as well, which also reduces the guarantees.

The blog post you mention is more about the mathematical aspects of algebraic data types that about their practical use in programming. I think most functional programmers first learnt about algebraic data types by using them in some programming language, and only later studied their algebraic properties.

Indeed, the intent of the blog post is clear from its very beginning:

In this series of posts I'll explain why Haskell's data types are called algebraic - without mentioning category theory or advanced math.

Anyway, the practical usefulness of algebraic types is best appreciated by playing around with them.

Suppose, for instance, you want to write a function to intersect two segments on a plane.

def intersect(s1: Segment, s2: Segment): ???

What should the result be? It's tempting to write

def intersect(s1: Segment, s2: Segment): Point

but what if there's no intersection? One might attempt to patch that corner case by returning null , or by throwing a NoIntersection exception. However, two segments might also overlap on more than one point, when they lie on the same straight line. What should we do in such cases? Throwing another exception?

The algebraic types approach is to design a type covering all the cases:

sealed trait Intersection
case object NoIntersection extends Intersection
case class SinglePoint(p: Point) extends Intersection
case class SegmentPortion(s: Segment) extends Intersection

def intersect(s1: Segment, s2: Segment): Intersection

There are many practical cases where such approach feels quite natural. In some other languages lacking algebraic types, one has to resort to exceptions, to null s (also see the billion-dollar mistake ), to non-sealed classes (making it impossible for the compiler to check for exhaustiveness of pattern matching), or to other features provided by the language. Arguably, the "best" option in OOP is to use the Visitor pattern to encode algebraic types and pattern matching in languages which have no such features. Still, having that directly supported in the language, as in scala, is much more convenient.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM