
A real-life example where pattern matching is preferable to a case expression in Haskell?

I have been working through the Real World Haskell book and did the lastButOne exercise. I came up with two solutions, one with pattern matching:

lastButOne :: [a] -> a
lastButOne ([]) = error "Empty List"
lastButOne (x:[]) = error "Only one element"
lastButOne (x:[x2]) = x
lastButOne (x:xs) = lastButOne xs

And one using a case expression:

lastButOneCase :: [a] -> a
lastButOneCase x =
  case x of
    [] ->  error "Empty List"
    (x:[]) ->  error "Only One Element"
    (x:[x2]) ->  x
    (x:xs) ->  lastButOneCase xs

What I wanted to find out is when pattern matching would be preferred over case expressions, and vice versa. This example did not help me decide: both functions work as intended, and nothing about them led me to choose one implementation over the other. So at first glance the choice seems to be purely a matter of preference?

Are there good examples in real source code, whether in Haskell's own libraries, on GitHub, or somewhere else, that show when one method is preferred over the other?

First a short terminology diversion: I would call both of these "pattern matching". I'm not sure there is a good term for distinguishing pattern-matching-via-case and pattern-matching-via-multiple-definition.

The technical distinction between the two is quite light indeed. You can verify this yourself by asking GHC to dump the Core it generates for the two functions, using the -ddump-simpl flag. I tried this at a few different optimization levels, and in all cases the only differences in the Core were naming. (By the way, if anyone knows a good "semantic diff" program for Core -- one which knows about alpha equivalence at the very least -- I'm very interested in hearing about it!)
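To try the comparison yourself, something like the following module could be used. It is only a sketch: the names lastButOneEq and lastButOneViaCase are made up for this illustration, and -dsuppress-all is an extra GHC flag (not mentioned above) that hides most annotations so the two definitions are easier to compare by eye.

```haskell
{-# OPTIONS_GHC -ddump-simpl -dsuppress-all #-}
-- The pragma above asks GHC to print the simplified Core for this
-- module whenever it is compiled or loaded.

-- Equation style (hypothetical name).
lastButOneEq :: [a] -> a
lastButOneEq []       = error "Empty list"
lastButOneEq [_]      = error "Only one element"
lastButOneEq [x, _]   = x
lastButOneEq (_ : xs) = lastButOneEq xs

-- Case style (hypothetical name).
lastButOneViaCase :: [a] -> a
lastButOneViaCase list =
  case list of
    []       -> error "Empty list"
    [_]      -> error "Only one element"
    [x, _]   -> x
    (_ : xs) -> lastButOneViaCase xs
```

Loading the file in GHCi, or compiling it with different -O levels, prints the Core for both functions side by side.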

There are a few small gotchas to watch out for, though. You might wonder whether the following is also equivalent:

{-# LANGUAGE LambdaCase #-}
lastButOne = \case
  [] ->  error "Empty List"
  (x:[]) ->  error "Only One Element"
  (x:[x2]) ->  x
  (x:xs) ->  lastButOne xs

In this case, the answer is yes. But consider this similar-looking one:

-- ambiguous type error
sort = \case
  [] -> []
  x:xs -> insert x (sort xs)

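All of a sudden this is a typeclass-polymorphic CAF with no explicit arguments, so without a type signature the monomorphism restriction applies (it is on by default in compiled modules, though recent GHCi sessions turn it off) and GHC can reject the definition with an ambiguous-type error, whereas the superficially identical version with an explicit argument is accepted: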

-- this is fine!
sort [] = []
sort (x:xs) = insert x (sort xs)
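If you do want to keep the \case form, one common way out is to give the binding an explicit type signature, since the monomorphism restriction only affects bindings that lack one. A minimal sketch, using the made-up name insertionSort to avoid clashing with Data.List.sort:

```haskell
{-# LANGUAGE LambdaCase #-}
import Data.List (insert)

-- The explicit signature means the monomorphism restriction does not
-- apply, so the point-free \case definition stays polymorphic.
insertionSort :: Ord a => [a] -> [a]
insertionSort = \case
  []     -> []
  x : xs -> insert x (insertionSort xs)
```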

The other minor difference (which I forgot about -- thank you to Thomas DuBuisson for reminding me) is in the handling of where clauses. Since where clauses are attached to binding sites, they cannot be shared across multiple equations but can be shared across multiple cases. For example:

-- error; the where clause attaches to the second equation, so
-- empty is not in scope in the first equation
null [] = empty
null (x:xs) = nonempty
  where empty = True
        nonempty = False

-- ok; the where clause attaches to the equation, so both empty
-- and nonempty are in scope for the entire case expression
null x = case x of
  [] -> empty
  x:xs -> nonempty
  where
  empty = True
  nonempty = False

You might think this means you can do something with equations that you can't do with case expressions, namely, have different meanings for the same name in the two equations, like this:

null [] = answer where answer = True
null (x:xs) = answer where answer = False

However, since the patterns of case expressions are binding sites, this can be emulated in case expressions as well:

null x = case x of
  [] -> answer where answer = True
  x:xs -> answer where answer = False

Whether the where clause is attached to the case's pattern or to the equation depends on indentation, of course.
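To make the indentation point concrete, here is a small sketch with both layouts next to each other. The function names are invented for this example; the only difference that matters is how far the where keyword is indented relative to the case alternatives.

```haskell
-- where is indented deeper than the alternatives, so each where
-- belongs to the alternative directly above it, and the two bindings
-- named answer are independent of each other.
isEmptyPerAlt :: [a] -> Bool
isEmptyPerAlt list = case list of
    []    -> answer
      where answer = True
    _ : _ -> answer
      where answer = False

-- where is indented less than the alternatives, so it closes the case
-- block and belongs to the whole equation; both names are in scope in
-- every alternative.
isEmptyShared :: [a] -> Bool
isEmptyShared list = case list of
    []    -> empty
    _ : _ -> nonempty
  where
    empty    = True
    nonempty = False
```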

If I recall correctly, both of these will "desugar" into the same Core in GHC, so the choice is purely stylistic. Personally I would go for the first one. As someone said, it's shorter, and what you term "pattern matching" is intended to be used this way. (Actually, the second version is also pattern matching, just using a different syntax for it.)

It's a stylistic preference. Some people sometimes argue that one choice or another makes certain code changes take less effort, but I generally find such arguments, even when accurate, don't actually amount to a big improvement. So do as you like.

A perspective that's well worth bringing into this is Hudak, Hughes, Peyton Jones and Wadler's paper "A History of Haskell: Being Lazy With Class". Section 4.4 is about this topic. The short story: Haskell supports both because the designers couldn't agree on one over the other. Yep, again, it's a stylistic preference.

When you're matching on more than one expression, case expressions start to look more attractive.

f pat11 pat21 = ...
f pat11 pat22 = ...
f pat11 pat23 = ...
f pat12 pat24 = ...
f pat12 pat25 = ...

can be more annoying to write than

f pat11 y =
  case y of
    pat21 -> ...
    pat22 -> ...
    pat23 -> ...
f pat12 y =
  case y of
    pat24 -> ...
    pat25 -> ...
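As a concrete instance of the schematic above, here is a small sketch with invented types (Door, Action) and invented function names. Both versions compute the same thing; the case version just avoids repeating the first pattern on every line.

```haskell
data Door   = Open | Closed
data Action = Push | Pull | Knock

-- Equation style: the first pattern is repeated on every line.
reactEq :: Door -> Action -> String
reactEq Closed Push  = "the door opens"
reactEq Closed Pull  = "the door stays closed"
reactEq Closed Knock = "someone answers"
reactEq Open   Knock = "nobody notices"
reactEq Open   _     = "the door closes"

-- Mixed style: one equation per first pattern, a case for the second.
reactCase :: Door -> Action -> String
reactCase Closed action =
  case action of
    Push  -> "the door opens"
    Pull  -> "the door stays closed"
    Knock -> "someone answers"
reactCase Open action =
  case action of
    Knock -> "nobody notices"
    _     -> "the door closes"
```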

More significantly, I've found that when using GADTs, the "declaration style" doesn't seem to propagate evidence from left to right the way I'd expect it to. There might be some trick I haven't worked out, but I end up having to nest case expressions to avoid spurious incomplete pattern warnings.
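For illustration, the nested-case shape described here looks roughly like the sketch below. The types Ty and Val and the function render are invented for this example, and it makes no claim about which GHC versions warn on the equation-style equivalent; it only shows how matching on the first argument brings the type equality into scope before the second argument is inspected.

```haskell
{-# LANGUAGE GADTs #-}

-- A type-level tag and a value indexed by the same type parameter.
data Ty a where
  TInt  :: Ty Int
  TBool :: Ty Bool

data Val a where
  VInt  :: Int  -> Val Int
  VBool :: Bool -> Val Bool

-- Matching on ty first refines a, so inside each branch only one
-- constructor of Val is type-correct for the inner case.
render :: Ty a -> Val a -> String
render ty val =
  case ty of
    TInt ->
      case val of
        VInt n -> show n
    TBool ->
      case val of
        VBool b -> if b then "yes" else "no"
```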
