简体   繁体   中英

About Church encoded lists in Haskell

Various optimisation problems, like this one , led to Church encoded lists as a way to enable stream fusion, ie the compiler's elimination of intermediate results (eg lists). Here's the definition that was used successfully in the optimisation problem:

{-# LANGUAGE RankNTypes #-}

-- A list encoded as a strict left fold.
newtype ListL a = ListL {build :: forall b. (b -> a -> b) -> b -> b}

Here's how I look at Church-somethings: Instead of asking what an 'something' is , ask what it can do for you. In the case of lists the answer is: Lists can be folded over. In order to fold, I need an 'update' function of type b->a->b and a starting value of type b . Then I will give you back the result of the fold, which is of type b . Hence the definition of ListL . Here are a few basic operations on ListL :

mapL :: (a -> a') -> ListL a -> ListL a'
mapL f l = ListL (\f' b' -> build l (\b a -> f' b (f a)) b')

instance Functor ListL where fmap = mapL

fromList :: [a] -> ListL a
fromList l = ListL (\c z -> foldl' c z l)

toList :: ListL a -> [a]
toList l = build l snoc [] where snoc xs x = xs ++ [x]

nullL :: ListL a -> Bool
nullL l = build l (\_ _->False) True

Here's more:

filterL :: (a->Bool) -> ListL a -> ListL a
filterL p l = ListL (\f b->build l (\b' a'->if p a' then f b' a' else b') b)

iterUntil :: (a->Bool) -> a -> (a->a) -> ListL a
iterUntil p a f = ListL (\g b-> snd $ until (p.fst) (\(a',b')->(f a', g b' a')) (a,b))

iterUntil iterates a function a->a , starting with some value of type a , until the predicate a->bool is fulfilled. A function like Prelude's iterate isn't possible - at least I don't know how to define it, it would have to be some kind of recursion.

Continuing with examples, length and sum are just exercises in choosing the right 'update' function and starting value in a foldl :

lengthL :: ListL a -> Int
lengthL l = build l (\b _ -> b+1) 0

sumL :: Num a => ListL a -> a
sumL l = build l (+) 0

Now, let's try headL :

headL :: ListL a -> a
headL l = build l (\_ a->a) _   -- this does not compile!

No matter what starting b is provided, the first a should be returned. build l needs a b , but we don't have one. This is a weird one: Basically we want to tell the compiler: You don't need the b , trust me... A headL' :: ListL a -> ListL a , on the other hand, is easy to construct. An error "empty list!" in place of the hole _ doesn't work because it always gets called - laziness doesn't seem to take care of this. So, with headL I'm stuck. Therefore here is

Question 1: How is headL implemented?

The second issue appears when trying to implement the equivalent of repeatM :: Monad m => ma -> m [a] . As with iterUntil , predicate a->Bool is needed to stop the iteration:

iterUntilM :: Monad m => (a->Bool) -> m a -> m (ListL a)

The purpose is clear: Repeat a monadic action ma until a->Bool is satisfied. The idea is, of course, to fold this ListL a right away and achieve stream fusion (list fusion). For example:

import System.Random (randomIO)

main :: IO ()
main = do
     rs <- iterUntilM (>42::Int) randomIO
     print $ lengthL rs

The example is rather contrived, it prints the number of draws it took until the first number >42 was found. In a more realistic setting the monad m is, for example, an ST s monad that wraps some FFI. The point is: This should run efficiently. I'm thoroughly stuck with this one. How do I entangle the (>>=) :: ma -> (a->mb) -> mb with build to get a m (ListL a) ? Ie this is

Question 2: How is iterUntilM implemented?

Other than being a good learning exercise, is this actually a good idea?

In general, when you remove assumptions about a type, the function you write will not only be more general (in terms of what types it can use), it will also be more specific about what exactly it's doing. That's what's happening with the church-encoding allowing fusion: when lists are represented as

data [a] = [] | a : [a]

There are countless ways to use those in a function, only one of them being foldr . However, when you have:

newtype List a = { runList :: forall b. (a -> b -> b) -> b -> b }

The only way to use that type is through foldr . This is what lets you do the optimisations we know and love. Stream fusion is just one of them, by the way: you also get O(1) append, for instance.

Your type is more constrained still: it tells us that the underlying list can't be (meaningfully) infinite.

There's another constrained representation of lists that shifts the focus:

data List a = forall b. List b (b -> Maybe (a, b))

Where the church-encoded list is a consumer, this is a producer . It says nothing about how the list can be consumed, but an awful lot about how it can be made.

So we've seen that we get a lot from these constrained representations, what do we lose? tail is a good example. For the producer:

tail (List x f) = case f x of
  Just (_,xs) -> List xs f

And for the consumer:

tail xs =
    List (\c n ->
        runList xs 
            (\h t g -> g h (t c)) 
            (const n) 
            (const id))

The consumer's implementation is O(n) , whereas the producer's is obviously O(1) .

Both of these types can allow for fusion, but some functions can be more efficiently implemented in one than the other. GHC happened to choose the former representation as its basis for fusion, but there's nothing fundamental that makes that choice superior: most functions Haskellers were using just seemed to work better in the foldr/build pattern of fusion than the other one. In other places , the unfolding pattern is used.

That preamble out of the way, there are two questions we need to ask:

  • Do these functions ( head and iterUntilM ) work efficiently in only the foldr representation (like append), or in the unfoldr representation (like tail ), or both (like map )?
  • Is the strict left-fold encoding the right choice for these? Is it too constrained (ie do we need foldr ?), or can it be constrained even more?

head can be implemented on foldr -encoded lists pretty easily:

head xs = runList xs const (error "head: empty list")

On foldl' -lists, it's a little more complicated:

head xs =
    fromMaybe
        (error "head: empty list")
        (build xs (\xs x -> maybe (Just x) Just xs) Nothing)

You'll notice that this function (like tail on foldr -lists) is O(n) . It also doesn't work for infinite lists. This is a good indication that foldl' isn't the right choice for fusing head .

Now, for iterUntilM , we see a case where (I don't think) even fusion is possible. Because the m ends up on the outside, you have to run all of the effects in the list (materialising it).

For a good overview of this area, check out this blog post.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM