I have the following monad transformer:
newtype Pdf' m a = Pdf' {
    unPdf' :: StateT St (Iteratee ByteString m) a
    }
type Pdf m = ErrorT String (Pdf' m)
Basically, it uses an underlying Iteratee that reads and processes a PDF document. It requires a random-access source, so that it does not have to keep the whole document in memory.
I need to implement a function that saves a PDF document, and I want it to be lazy: it should be possible to save a document in constant memory.
I can produce a lazy ByteString:
import Data.ByteString.Lazy (ByteString)
import qualified Data.ByteString.Lazy as BS

save :: Monad m => Pdf m ByteString
save = do
    -- actually it is a loop
    str1 <- serializeTheFirstObject
    storeOffsetForTheFirstObject (BS.length str1)
    str2 <- serializeTheSecondObject
    storeOffsetForTheSecondObject (BS.length str2)
    ...
    strn <- serializeTheNthObject
    storeOffsetForTheNthObject (BS.length strn)
    table <- dumpRefTable
    -- parentheses are required: return takes the whole expression
    return (mconcat [str1, str2, ..., strn] `mappend` table)
But the actual output can depend on previous output. (Details: a PDF document contains a so-called "reference table" with the absolute offset, in bytes, of every object inside the document. That offset depends on the length of the ByteString each PDF object is serialized to.)
How can I ensure that the save function will not force the entire ByteString before returning it to the caller?
Is it better to take a callback as an argument and call it every time I have something to output?
import Data.ByteString (ByteString)
save :: Monad m => (ByteString -> Pdf m ()) -> Pdf m ()
Is there a better solution?
To build this in one pass you will need to store (perhaps in the state) where your indirect objects have been written. So save needs to keep track of the absolute byte position as it works. I have not considered whether your Pdf monad is suitable for this task. When you get to the end, you can use the addresses stored in the state to create the xref section.
I do not think a two-pass algorithm will help.
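The one-pass idea can be sketched as follows. This is only an illustration under stated assumptions: it uses a plain StateT from the transformers package instead of the Pdf monad from the question, the names emitObject, renderTable and saveWith are hypothetical, and the "xref" output is a simplified stand-in rather than the real PDF cross-reference format. Each object goes to a caller-supplied sink as soon as it is serialized, and only its offset is kept in the state.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (StateT, evalStateT, get, put)
import qualified Data.ByteString.Char8 as B

-- State: (absolute position so far, offsets of written objects, newest first)
type Offsets = (Int, [Int])

-- Record where the object starts, advance the position, then hand the
-- bytes to the sink. Only the offset is retained, not the bytes.
emitObject :: Monad m => (B.ByteString -> m ()) -> B.ByteString -> StateT Offsets m ()
emitObject sink obj = do
  (pos, offs) <- get
  let pos' = pos + B.length obj
  pos' `seq` put (pos', pos : offs)
  lift (sink obj)

-- Hypothetical, simplified table: one offset per line (not real xref syntax)
renderTable :: [Int] -> B.ByteString
renderTable offs = B.pack (unlines ("xref" : map show offs))

saveWith :: Monad m => (B.ByteString -> m ()) -> [B.ByteString] -> m ()
saveWith sink objs = flip evalStateT (0, []) $ do
  mapM_ (emitObject sink) objs
  (_, offs) <- get
  lift (sink (renderTable (reverse offs)))

main :: IO ()
main = saveWith B.putStr (map B.pack ["obj1\n", "object2\n", "o3\n"])
```

With these three sample objects the table lists the offsets 0, 5 and 13. In a real implementation the objects would be serialized one at a time inside the loop rather than passed in as a list.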
Edit June 6th: Perhaps I understand your desire better now. For very fast generation of documents, e.g. HTML, there are several libraries on Hackage with "blaze" in the name. The technique is to avoid using 'mconcat' on the ByteString and to use it on an intermediate 'builder' type instead. The core library for this seems to be 'blaze-builder', which is used in 'blaze-html' and 'blaze-textual'.
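A minimal sketch of the builder technique follows. Note the substitution: it uses Data.ByteString.Builder from the bytestring package, which follows the same idea, rather than blaze-builder itself. The renderObject function and the sample objects are hypothetical.

```haskell
import qualified Data.ByteString.Builder as B
import System.IO (stdout)

-- Hypothetical serializer: object number plus an already formatted body
renderObject :: (Int, String) -> B.Builder
renderObject (n, body) =
  B.intDec n <> B.string7 " 0 obj\n" <> B.string7 body <> B.string7 "\nendobj\n"

-- Builders are combined with the Monoid instance; no bytes are copied
-- until the builder is finally run.
renderAll :: [(Int, String)] -> B.Builder
renderAll = foldMap renderObject

main :: IO ()
main = B.hPutBuilder stdout (renderAll [(1, "<< >>"), (2, "<< /Type /Catalog >>")])
```

A Builder does not expose its length before it is run, so the byte offsets needed for the reference table would still have to be tracked separately.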
The solution I have found so far is a Coroutine. Example:
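-- from the monad-coroutine package
import Control.Monad.Coroutine (Coroutine, resume, suspend)
import Control.Monad.Coroutine.SuspensionFunctors (Yield (..))

-- yields one line and suspends, i times in total
proc :: Int -> Coroutine (Yield String) IO ()
proc 0 = return ()
proc i = do
    suspend $ Yield "Hello World\n" (proc $ i - 1)

main :: IO ()
main = do
    go (proc 10)
  where
    -- the caller drives the coroutine and decides what to do with each chunk
    go cr = do
        r <- resume cr
        case r of
            Right () -> return ()
            Left (Yield str cont) -> do
                putStr str
                go cont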
It does the same work as a callback, but the caller has full control over output generation.