根據時間限制管道？

Question

是否可以創建管道以獲取在特定時間段內已向下游發送的所有值？ 我正在實現一個服務器，該協議允許我連接傳出的數據包並將它們壓縮在一起，因此我想每隔100ms有效地“清空”下游ByteString的隊列， mappend它們mappend在一起，然后產生下一個壓縮的管道。

Answer 1

這是使用pipes-concurrency的解決方案。 您給它任何Input ，它將定期消耗所有值的輸入：

import Control.Applicative ((<|>))
import Control.Concurrent (threadDelay)
import Data.Foldable (forM_)
import Pipes
import Pipes.Concurrent

drainAll :: Input a -> STM (Maybe [a])
drainAll i = do
    ma <- recv i
    case ma of
        Nothing -> return Nothing
        Just a  -> loop (a:)
  where
    loop diffAs = do
        ma <- recv i <|> return Nothing
        case ma of
            Nothing -> return (Just (diffAs []))
            Just a  -> loop (diffAs . (a:))

bucketsEvery :: Int -> Input a -> Producer [a] IO ()
bucketsEvery microseconds i = loop
  where
    loop = do
        lift $ threadDelay microseconds
        ma <- lift $ atomically $ drainAll i
        forM_ ma $ \a -> do
            yield a
            loop

通過選擇用於構建Input的Buffer的類型，可以更好地控制從上游消耗元素的方式。

如果您不pipes-concurrency ，則可以閱讀該教程，該教程說明了如何使用spawn ， Buffer和Input 。

Answer 2

這是一個可能的解決方案。 它基於Pipe ，該Pipe使用Bool標記向下游的ByteString ，以便標識屬於同一“時間段”的ByteStrings 。

首先，一些進口：

import Data.AdditiveGroup
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Builder as BB
import Data.Thyme.Clock
import Data.Thyme.Clock.POSIX
import Control.Monad.State.Strict
import Control.Lens (view)
import Control.Concurrent (threadDelay)
import Pipes
import Pipes.Lift
import qualified Pipes.Prelude as P
import qualified Pipes.Group as PG

這是標記Pipe 。 它StateT內部使用StateT ：

tagger :: Pipe B.ByteString (B.ByteString,Bool) IO ()
tagger = do
    startTime <- liftIO getPOSIXTime
    evalStateP (startTime,False) $ forever $ do
        b <- await
        currentTime <- liftIO getPOSIXTime
        -- (POSIXTime,Bool) inner state
        (baseTime,tag) <- get
        if (currentTime ^-^ baseTime > timeLimit)
            then let tag' = not tag in
                 yield (b,tag') >> put (currentTime, tag')
            else yield $ (b,tag)
    where
        timeLimit = fromSeconds 0.1

然后，我們可以使用的功能從pipes-group包組ByteString小號屬於相同的“時間桶”到懶惰ByteString S：

batch :: Producer B.ByteString IO () -> Producer BL.ByteString IO ()
batch producer =  PG.folds (<>) mempty BB.toLazyByteString
                . PG.maps (flip for $ yield . BB.byteString . fst)
                . view (PG.groupsBy $ \t1 t2-> snd t1 == snd t2)
                $ producer >-> tagger

它似乎正確批處理。 該程序：

main :: IO ()
main = do
    count <- P.length $ batch (yield "boo" >> yield "baa")
    putStrLn $ show count
    count <- P.length $ batch (yield "boo" >> yield "baa" 
                               >> liftIO (threadDelay 200000) >> yield "ddd")
    putStrLn $ show count

具有輸出：

1
2

請注意，僅當下一個存儲桶的第一個元素到達時，才會yield “時間存儲桶”的內容。 他們不是yield自動編每100毫秒。 這可能對您來說不是問題。 您想每100ms自動yield一次，可能需要基於pipes-concurrency的不同解決方案。

另外，您可以考慮直接使用pipes-group提供的基於FreeT的“效果列表”。 這樣，您可以在存儲桶裝滿之前開始在“時間存儲桶”中壓縮數據。

Answer 3

因此，與Daniel的回答不同，我不會在生成數據時標記數據。 它僅從上游獲取至少一個元素，然后繼續在monoid中聚合更多的值，直到時間間隔過去為止。

該代碼使用列表進行匯總，但可以使用更好的monoid進行匯總

import Pipes
import qualified Pipes.Prelude as P

import Data.Time.Clock
import Data.Time.Calendar

import Data.Time.Format

import Data.Monoid

import Control.Monad

-- taken from pipes-rt
doubleToNomDiffTime :: Double -> NominalDiffTime
doubleToNomDiffTime x =
  let d0 = ModifiedJulianDay 0
      t0 = UTCTime d0 (picosecondsToDiffTime 0)
      t1 = UTCTime d0 (picosecondsToDiffTime $ floor (x/1e-12))
  in  diffUTCTime t1 t0

-- Adapted from from pipes-parse-1.0 
wrap
  :: Monad m =>
     Producer a m r -> Producer (Maybe a) m r
wrap p = do
  p >-> P.map Just
  forever $ yield Nothing
yieldAggregateOverTime
  :: (Monoid y,  -- monoid dependance so we can do aggregation
      MonadIO m  -- to beable to get the current time the
                 -- base monad must have access to IO
     ) =>
     (t -> y) -- Change element from upstream to monoid
  -> Double -- Time in seconds to aggregate over
  -> Pipe (Maybe t) y m ()
yieldAggregateOverTime wrap period = do
  t0 <- liftIO getCurrentTime
  loop mempty (dtUTC `addUTCTime` t0)
  where
    dtUTC = doubleToNomDiffTime period
    loop m ts = do
      t <- liftIO getCurrentTime
      v0 <- await -- await at least one element
      case v0 of
        Nothing -> yield m
        Just v -> do
          if t > ts
          then do
            yield (m <> wrap v)
            loop mempty (dtUTC `addUTCTime` ts)
          else do
            loop (m <> wrap v) ts


main = do
  runEffect $  wrap (each [1..]) >-> yieldAggregateOverTime (\x -> [x]) (0.0001)
                            >-> P.take 10 >-> P.print

根據您的CPU負載，輸出數據的聚合方式會有所不同。 每個塊中至少有一個元素。

$ ghc Main.hs -O2
$ ./Main
[1,2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]

$ ./Main
[1,2]
[3]
[4]
[5]
[6,7,8,9,10]
[11,12,13,14,15,16,17,18]
[19,20,21,22,23,24,25,26]
[27,28,29,30,31,32,33,34]
[35,36,37,38,39,40,41,42]
[43,44,45,46,47,48,49,50]

$ ./Main
[1,2,3,4,5,6]
[7]
[8]
[9,10,11,12,13,14,15,16,17,18,19,20]
[21,22,23,24,25,26,27,28,29,30,31,32,33]
[34,35,36,37,38,39,40,41,42,43,44]
[45,46,47,48,49,50,51,52,53,54,55]
[56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72]
[73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88]
[89,90,91,92,93,94,95,96,97,98,99,100,101,102,103]

$ ./Main
[1,2,3,4,5,6,7]
[8]
[9]
[10,11,12,13,14,15,16,17,18]
[19,20,21,22,23,24,25,26,27]
[28,29,30,31,32,33,34,35,36,37]
[38,39,40,41,42,43,44,45,46]
[47,48,49,50]
[51,52,53,54,55,56,57]
[58,59,60,61,62,63,64,65,66]

您可能想看一下pipes-rt的源代碼，它顯示了一種處理管道時間的方法。

編輯：感謝DanielDíazCarrete，他采用了pipes-parse-1.0技術來處理上游終端。 管道組解決方案也應該使用相同的技術。

根據時間限制管道？

問題描述

3 個解決方案

解決方案1
3 已采納 2014-03-09 19:03:44

解決方案2
1 2014-03-09 11:02:59

解決方案3
0 2014-03-09 11:29:03

根據時間限制管道？

問題描述

3 個解決方案

解決方案1 3 已采納 2014-03-09 19:03:44

解決方案2 1 2014-03-09 11:02:59

解決方案3 0 2014-03-09 11:29:03

解決方案1
3 已采納 2014-03-09 19:03:44

解決方案2
1 2014-03-09 11:02:59

解決方案3
0 2014-03-09 11:29:03