Haskell optimization of a function looking for a bytestring terminator
Profiling showed that about 65% of my program's time is spent in the following code. It uses the Data.Binary.Get monad to walk through a bytestring looking for the terminator. If it sees 0xff, it checks whether the next byte is 0x00. If it is, it drops the 0x00 and continues. If it is not 0x00, it drops both bytes, converts the accumulated list of bytes to a bytestring, and returns it.
Are there any obvious ways to optimize this? I can't see any.

    parseECS = f [] False
        where
        f acc ff = do
            b <- getWord8
            if ff
                then if b == 0x00
                    then f (0xff:acc) False
                    else return $ L.pack (reverse acc)
                else if b == 0xff
                    then f acc True
                    else f (b:acc) False

There seems to be a bug here: an exception is raised if you reach the end of the byte stream before a 0xff byte followed by a non-0x00 byte is found. Here's a modified version of your function:

    parseECS :: Get L.ByteString
    parseECS = f [] False
        where
        f acc ff = do
            noMore <- isEmpty
            if noMore
                then return $ L.pack (reverse acc)
                else do
                    b <- getWord8
                    if ff
                        then
                            if b == 0x00
                                then f (0xff:acc) False
                                else return $ L.pack (reverse acc)
                        else
                            if b == 0xff
                                then f acc True
                                else f (b:acc) False

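To make the end-of-input behaviour concrete, here is a minimal sketch that runs the modified parser with `runGet` on two inputs. The sample byte sequences and the helper name `decode` are made up for illustration and are not from the question.

```haskell
import Data.Binary.Get (Get, getWord8, isEmpty, runGet)
import qualified Data.ByteString.Lazy as L
import Data.Word (Word8)

-- The end-of-input-safe parser from the answer, reproduced so that this
-- sketch is self-contained.
parseECS :: Get L.ByteString
parseECS = f [] False
  where
    f acc ff = do
      noMore <- isEmpty
      if noMore
        then return $ L.pack (reverse acc)
        else do
          b <- getWord8
          if ff
            then if b == 0x00
                   then f (0xff:acc) False
                   else return $ L.pack (reverse acc)
            else if b == 0xff
                   then f acc True
                   else f (b:acc) False

-- Hypothetical sample inputs, chosen only for illustration.
terminated, unterminated :: L.ByteString
terminated   = L.pack [0x01, 0xff, 0x00, 0x02, 0xff, 0xd9, 0x03]
unterminated = L.pack [0x01, 0x02, 0x03]

-- Hypothetical helper: run the parser and show the result as a byte list.
decode :: L.ByteString -> [Word8]
decode = L.unpack . runGet parseECS
```

With these inputs, `decode terminated` yields `[0x01, 0xff, 0x02]` (the stuffed 0x00 is dropped and parsing stops at 0xff 0xd9), and `decode unterminated` yields `[0x01, 0x02, 0x03]` instead of failing at the end of the stream.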
I haven't done any profiling, but this function will probably be faster, because reversing long lists is expensive. I'm not sure how lazy getRemainingLazyByteString is. If it's too strict, this probably won't work for you.

    -- Requires: import Control.Monad (liftM)
    parseECS2 :: Get L.ByteString
    parseECS2 = do
        wx <- liftM L.unpack $ getRemainingLazyByteString
        return . L.pack . go $ wx
        where
        go []             = []
        go (0xff:0x00:wx) = 0xff : go wx   -- stuffed 0x00: keep the 0xff
        go (0xff:_)       = []             -- terminator: stop here
        go (w:wx)         = w : go wx

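One thing to check before adopting this version: `getRemainingLazyByteString` takes all remaining input, so any bytes after the terminator are no longer available to later parsers. The sketch below uses `runGetOrFail` to show what is left over. The helper name `decodeWithRest` is hypothetical, and `fmap` stands in for `liftM` only to avoid an extra import.

```haskell
import Data.Binary.Get (Get, getRemainingLazyByteString, runGetOrFail)
import qualified Data.ByteString.Lazy as L
import Data.Word (Word8)

-- The list-based parser from the answer, reproduced for a self-contained sketch.
parseECS2 :: Get L.ByteString
parseECS2 = do
    wx <- fmap L.unpack getRemainingLazyByteString
    return . L.pack . go $ wx
  where
    go []             = []
    go (0xff:0x00:wx) = 0xff : go wx
    go (0xff:_)       = []
    go (w:wx)         = w : go wx

-- Hypothetical helper: run the parser and report
-- (decoded bytes, bytes left unconsumed in the input).
decodeWithRest :: L.ByteString -> Maybe ([Word8], [Word8])
decodeWithRest input =
  case runGetOrFail parseECS2 input of
    Right (rest, _, out) -> Just (L.unpack out, L.unpack rest)
    Left _               -> Nothing
```

For the made-up input `L.pack [0x01, 0xff, 0x00, 0x02, 0xff, 0xd9, 0x03, 0x04]`, this returns `Just ([0x01, 0xff, 0x02], [])`: the trailing `0x03, 0x04` have been consumed along with everything else. That is fine if the segment is the last thing you parse, but not if more data follows the terminator.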
If the problem is in reverse, you can use lookAhead to scan for the position of the next 0xff, then go back and build the new string from whole chunks:
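
    -- Requires the BangPatterns extension and liftM from Control.Monad.
    parseECS2 :: Get L.ByteString
    parseECS2 = do
        -- Read one byte, or Nothing at end of input.
        let nextWord8 = do
                noMore <- isEmpty
                if noMore then return Nothing
                          else liftM Just getWord8
        -- Count bytes up to and including the next 0xff (Right),
        -- or up to end of input (Left).
        let scanChunk !n = do
                b <- nextWord8
                case b of
                    Just 0xff -> return (Right (n+1))
                    Just _    -> scanChunk (n+1)
                    Nothing   -> return (Left n)
        let readChunks = do
                c <- lookAhead (scanChunk 0)
                case c of
                    Left n  -> getLazyByteString n >>= \blk -> return [blk]
                    Right n -> do
                        blk <- getLazyByteString n
                        -- The byte after 0xff is only peeked at here; unless it
                        -- is a stuffed 0x00, it is left unconsumed.
                        b <- lookAhead nextWord8
                        case b of
                            Just 0x00 -> skip 1 >> liftM (blk:) readChunks
                            _         -> return [L.init blk]
        liftM (foldr L.append L.empty) readChunks
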
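The same scan-then-slice idea can also be sketched as a pure function over a lazy ByteString, using `L.break` to cut out whole runs of ordinary bytes instead of building a per-byte list. This is an untested-for-speed illustration, not a benchmarked replacement. The name `unstuff` and the decision to return the leftover input as the second component are assumptions made for this example.

```haskell
import qualified Data.ByteString.Lazy as L

-- | Hypothetical helper: decode up to the terminator (0xff followed by a
-- non-0x00 byte) and return (payload, input remaining after the terminator).
-- A stuffed pair 0xff 0x00 contributes a single 0xff to the payload.
unstuff :: L.ByteString -> (L.ByteString, L.ByteString)
unstuff = go []
  where
    -- 'chunks' holds the pieces found so far, most recent first.
    go chunks input =
      let (before, fromFF) = L.break (== 0xff) input
      in case L.uncons fromFF of
           -- No 0xff left: everything is payload.
           Nothing -> (finish (before : chunks), L.empty)
           Just (_, afterFF) ->
             case L.uncons afterFF of
               -- Stuffed byte: keep the 0xff, drop the 0x00, continue.
               Just (0x00, rest) -> go (L.singleton 0xff : before : chunks) rest
               -- Terminator: drop both bytes and stop.
               Just (_, rest)    -> (finish (before : chunks), rest)
               -- Input ends right after a 0xff: drop the 0xff.
               Nothing           -> (finish (before : chunks), L.empty)
    -- Only the short list of chunks is reversed, never the bytes themselves.
    finish = L.concat . reverse
```

For example, `unstuff (L.pack [0x01, 0xff, 0x00, 0x02, 0xff, 0xd9, 0x03, 0x04])` gives a payload of `[0x01, 0xff, 0x02]` and a remainder of `[0x03, 0x04]`. Note that the payload is only assembled once the terminator has been found, so this version does not stream its output.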