简体   繁体   中英

How can I reduce this tree in parallel in Haskell?

I have a simple tree which stores a sequence of values in its leaves and some simple functions to facilitate testing.

If I have an unbounded number of processors and the tree is balanced, I should be able reduce the tree using any binary associative operation (+, *, min, lcm) in logarithmic time.

By making Tree an instance of Foldable, I can reduce the tree sequentially from left to right or right to left using built-in functions, but this takes linear time.

How can I use Haskell to reduce such a tree in parallel?

{-# LANGUAGE DeriveFoldable #-}

data Tree a = Leaf a | Node (Tree a) (Tree a)
            deriving (Show, Foldable)

toList :: Tree a -> [a]
toList = foldr (:) []

range :: Int -> Int -> Tree Int
range x y
  | x < y     = Node (range x y') (range x' y)
  | otherwise = Leaf x
  where
    y' = quot (x + y) 2
    x' = y' + 1

The naive fold is written this way:

cata fLeaf fNode = go where
    go (Leaf z) = fLeaf z
    go (Node l r) = fNode (go l) (go r)

I suppose the parallel one would be pretty simply adapted:

parCata fLeaf fNode = go where
    go (Leaf z) = fLeaf z
    go (Node l r) = gol `par` gor `pseq` fNode gol gor where
        gol = go l
        gor = go r

But could even be written in terms of cata :

parCata fLeaf fNode = cata fLeaf (\l r -> l `par` r `pseq` fNode l r)

Update

I originally answered the question under the assumption that the reduction operation was not expensive. Here's an answer which performs an associative reduction in chunks of n elements.

That is, suppose op is an associative binary operation and you want to compute foldr1 op [1..6] , here's code which will evaluate it as:

(op (op 1 2) (op 3 4)) (op 5 6)

which allows for parallel evaluation.

import Control.Parallel.Strategies
import System.TimeIt
import Data.List.Split
import Debug.Trace

recChunk :: ([a] -> a) -> Int -> [a] -> a
recChunk op n xs =
  case chunksOf n xs of
    [a] -> op a
    cs  -> recChunk op n $ parMap rseq op cs

data N = N Int | Op [N]
  deriving (Show)

test1 = recChunk Op 2 $ map N [1..10]
test2 = recChunk Op 3 $ map N [1..10]

fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

fib' n | trace msg False = undefined
  where msg = "fib called with " ++ show n
fib' n = fib n

sumFib :: [Int] -> Int
sumFib xs | trace msg False = undefined
  where msg = "sumFib: " ++ show xs
sumFib xs = seq s (s + (mod (fib' (40 + mod s 2)) 1))
  where s = sum xs

main = do
  timeIt $ print $ recChunk sumFib 2 [1..20]

Original Answer

Since you have an associative operation, you can just use your toList function and evaluate the list in parallel with parMap or parList .

Here is some demo code which adds up the fib of each Leaf. I use parBuffer to avoid creating too many sparks - this is not needed if your tree is smallish.

I'm loading a tree from a file because it seemed that GHC with -O2 was detecting common sub-expressions in my test tree.

Also, adjust rseq to your needs - you may need rdeepseq depending on what you are accumulating.

{-# LANGUAGE DeriveFoldable #-}

import Control.Parallel.Strategies
import System.Environment
import Control.DeepSeq
import System.TimeIt
import Debug.Trace

fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

fib' n | trace msg False = undefined
  where msg = "fib called with " ++ show n
fib' n = fib n

data Tree a = Leaf a | Node (Tree a) (Tree a)
            deriving (Show, Read, Foldable)

toList :: Tree a -> [a]
toList = foldr (:) []

computeSum :: Int -> Tree Int -> Int
computeSum k t = sum $ runEval $ parBuffer k rseq $ map fib' $ toList t

main = do
  tree <- fmap read $ readFile "tree.in"
  timeIt $ print $ computeSum 4 tree
  return ()

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM