Suppose I have a function f
which takes an integer argument. f
may not terminate on some arguments, but its result is equally valuable. (For concreteness, the argument could be the seed to a random number generator, which is passed to a SAT solver.)
I want to use concurrency and invoke f 1
, f 2
, f 3
, etc., and return when the first one finishes. So, each thread should be running code that looks like
comp <- start_proc (f 1)
wait(comp || anyDone) -- wait for _either_ of these signals to be true
if comp then
set anyDone = True
What's the easiest way to do this? The AMB operator comes to mind, but I'd need to run all processes simultaneously (eg on a 24- or 80-core machine). (Distributed computing solutions would be even better.) A superficial look at the AMB wiki page suggests it may not support non-terminating processes?
Currently, I'm not getting the answers to work with what I want. I think this is probably more of an issue with how I'm creating processes than anything else.
Define
runProc (x:xs) =
createProcess (proc x xs) >>= \(_, _, _, h) -> waitForProcess h
Then, I want to race runProc ["zsh", "-c", "sleep 3"]
and runProc ["ls"]
. I modified Thomas' answer a little, but it didn't work.
raceL :: [IO α] -> IO α
raceL ops = do
mv <- newEmptyMVar
tids <- forM ops (\op -> forkIO (op >>= putMVar mv))
answer <- takeMVar mv
mapM_ killThread tids
return answer
Compiling with -threaded
and running with +RTS -N
(I have a 4-core machine) doesn't seem to help.
Why not just an MVar
and forkIO
?
import Control.Concurrent
import Control.Concurrent.MVar
import System.Environment
import Control.Monad
main = do
mv <- newEmptyMVar
[nrThreads] <- liftM (map read) getArgs
tids <- replicateM nrThreads (forkIO $ operation mv)
answer <- takeMVar mv
mapM_ killThread tids
operation :: MVar Int -> IO ()
operation mv = putMVar mv 5
This will fork nrThreads
light weight threads. Once one thread has finished it should place the answer in the provided MVar. All other threads will then be killed by the main thread. No explicit polling is needed as the GHC RTS will reschedule main
once the MVar
becomes non-empty.
Instead of amb, consider unamb ! It provides a handful of nice primitives for racing computations, both pure and impure. For example:
Prelude Data.Unamb> unamb (last [1..]) 32
32
Prelude Data.Unamb> race (threadDelay 5000000 >> return 3) readLn
Prelude Data.Unamb Control.Concurrent> race (threadDelay 5000000 >> return 3) readLn
56
56
Prelude Data.Unamb Control.Concurrent> race (threadDelay 5000000 >> return 3) readLn
3
One option would be to use STM to detect termination, then explicitly kill all other threads. We can define:
start_proc :: IO a -> IO (ThreadId, TVar (Maybe a))
start_proc job = do
resultVar <- newTVarIO Nothing
forkIO $ job >>= (atomically . writeTVar resultVar)
return resultVar
Then do:
any_parallel :: [IO a] -> IO a
any_parallel jobs = do
(threads, vars) <- liftM unzip $ mapM start_proc jobs
result <- atomically $ foldl orElse retry (map check_job vars)
mapM_ killThread threads
return result
where
check_job :: TVar (Maybe a) -> STM a
check_job resultVar = do
val <- readTVar resultVar
case val of
Nothing -> retry
Just x -> return x
The key thing here is, the first time run_multiple goes through its set of result variables, they're all Nothing
, and so it retry
s. The STM monad records which TVar
s it looked at, and whenever any of them is written, the STM transaction is re-executed. At this point, it sees one of the TVar
s is not Nothing
, and can take the result at that point.
Once we have a result, of course, we simply terminate all threads. This is likely to be faster than having them check in their inner loop for some shared flag; there's less contention on a shared MVar (or what-have-you).
Note that killThread
waits for the target thread to reach a 'safe point' (ie, memory allocation) before killing the thread. This cannot be guaranteed to occur if the target thread has a tight inner loop that does not perform any memory allocation. You may want to make sure the threads periodically perform an IO
action that forces allocation to occur.
Another way to do it is to manually schedule your code by using a monad to model steps of computation .
This can let you manually switch between different computation threads, stepping each a few at a time, until one of them finishes:
sum5 :: [ Computation (Int, Int) ]
sum5 = [ sum5' x 0 | x <- [ 0, 1.. ] ]
where sum5' x y = if x + y == 5
then return (x,y)
else do
y' <- return (y+1)
sum5' x y'
prod6 :: [ Computation (Int, Int) ]
prod6 = [ prod6' x 0 | x <- [ 0, 1.. ] ]
where prod6' x y = if x * y == 6
then return (x,y)
else do
y' <- return (y+1)
prod6' x y'
firstSolution :: [Computation a] -> Strategy a -> a
firstSolution cs s = head . toList . runComputation $ s cs
Then you can see how allow you to interleave computations (even non terminating ones)
ghci> firstSolution sum5 fair
(5,0)
ghci> firstSolution sum5 diagu
(0,5)
ghci> firstSolution sum5 diagd
(5,0)
ghci> firstSolution prod6 fair
^CInterrupted.
ghci> firstSolution prod6 diagu
(2,3)
ghci> firstSolution prod6 diagd
(3,2)
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.