Run-length encoding in Haskell

Question

import Data.List

data Encoding = Multiple Int Char | Single Char deriving (Eq,Show,Ord)

Run-length encoding

encode :: String -> [Encoding]
encode inputString = encoding (group inputString) []


encoding :: [String] -> [Encoding] -> [Encoding]
encoding groupString xs =
  if (length groupString == 0)
    then xs
    else
      case (head groupString) of
        ([c]) -> encoding (tail groupString) (xs ++ [Single c])
        (x)   -> encoding (tail groupString) (xs ++ [Multiple (length x) (head x)])

Run-length decoding

decode :: [Encoding] -> String
decode listString = decoding listString []

decoding :: [Encoding] -> String -> String
decoding inputString xs =
  if (length inputString == 0)
    then xs
    else
      case (head inputString) of
        (Single x)       -> decoding (tail inputString) (xs ++ [x])
        (Multiple num x) -> decoding (tail inputString) (xs ++ (replicate num x))

This is my solution for run-length encoding. Can anyone suggest a better way to write the encoding and decoding functions?

Answer 1

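Much of your code is devoted to updating an accumulator. You add elements to the tail of that accumulator, which has a large impact on performance.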

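The reason this is usually inefficient is that a Haskell list [a] is, at least conceptually, implemented as a linked list. As a result, concatenating two lists l1 and l2 with l1 ++ l2 takes O(|l1|) time (where |l1| is the number of elements in l1). So if the list already contains 100 elements, appending one more at the end has to walk past all 100 of them first, and building a list of n elements this way costs O(n^2) overall.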
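
To make the cost concrete, here is a minimal sketch that contrasts the two ways of building a list. The names appendAtEnd and consDirectly are hypothetical, picked only for this illustration; both simply double every element of the input.

```haskell
-- Accumulator style, as in the question: (++) copies its left argument,
-- so every step has to walk the whole accumulator. Quadratic overall.
appendAtEnd :: [Int] -> [Int]
appendAtEnd = go []
  where
    go acc []     = acc
    go acc (x:xs) = go (acc ++ [2 * x]) xs

-- Cons style: each element is emitted in O(1) and the recursive call
-- sits in the tail of the result. Linear overall, and lazy.
consDirectly :: [Int] -> [Int]
consDirectly []     = []
consDirectly (x:xs) = 2 * x : consDirectly xs
```

Both functions return the same list on finite input. Only the second also works on an infinite one: take 3 (consDirectly [1..]) gives [2,4,6], while the accumulator version cannot return anything before it has seen the end of the input.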

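Another anti-pattern is the use of head and tail. Yes, these functions can be used, but because you do not use pattern matching, nothing stops an empty list from being passed in, and then head and tail will raise an error.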
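
As a small sketch of the alternative, pattern matching makes the empty case explicit instead of leaving it to a runtime error. The names firstSafe and splitFirst are hypothetical helpers for this illustration and are not part of the question's code.

```haskell
-- Total replacement for head: the empty list is handled explicitly.
firstSafe :: [a] -> Maybe a
firstSafe []    = Nothing
firstSafe (x:_) = Just x

-- Total replacement for the head/tail pair: one match yields both parts.
splitFirst :: [a] -> Maybe (a, [a])
splitFirst []     = Nothing
splitFirst (x:xs) = Just (x, xs)
```

With a match like (x:xs) the compiler can also warn about a forgotten [] case, which it cannot do for a call to head.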

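Another problem is that you call length on the list. Lists in Haskell can be infinite, and on such a list a length call, once its result is demanded, never terminates. Even on a finite list, length walks the whole list just to find out whether it is empty.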
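
The difference can be sketched with two emptiness checks. Both names below are hypothetical; the second behaves like null from the Prelude when applied to a list.

```haskell
-- Walks the entire spine of the list before answering;
-- never returns on an infinite list.
isEmptyByLength :: [a] -> Bool
isEmptyByLength xs = length xs == 0

-- Looks only at the first constructor: O(1), and safe on infinite lists.
isEmptyByMatch :: [a] -> Bool
isEmptyByMatch []    = True
isEmptyByMatch (_:_) = False
```

For instance, isEmptyByMatch (repeat 'x') returns False immediately, whereas isEmptyByLength (repeat 'x') loops forever.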

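If you find yourself appending to the end of an accumulator, you can usually instead put the recursive call in the tail of the "cons" of the list you are building. So we can rewrite the program as: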

encode :: String -> [Encoding]
encode [] = []
encode (h:t)  = ...

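The question now is how to compute these values. We can use span :: (a -> Bool) -> [a] -> ([a],[a]). For a given predicate and list, this function constructs a 2-tuple whose first item is the longest prefix of the list in which all elements satisfy the predicate, and whose second item is the rest of the list. So we can write: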

encode :: String -> [Encoding]
encode [] = []
encode (h:t)  | nh > 1 = Multiple nh h : tl
              | otherwise = Single h : tl
    where (s1, s2) = span (h ==) t
          nh = 1 + length s1
          tl = encode s2

For example:

Prelude Data.List> encode "Foobaaar   qqquuux"
[Single 'F',Multiple 2 'o',Single 'b',Multiple 3 'a',Single 'r',Multiple 3 ' ',Multiple 3 'q',Multiple 3 'u',Single 'x']
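
To see span on its own, and to check that this version no longer needs the whole input up front, here is a small self-contained sketch. It repeats the encoder above so that the block compiles alone; spanDemo and lazyDemo are hypothetical names used only for this illustration.

```haskell
data Encoding = Multiple Int Char | Single Char deriving (Eq, Show, Ord)

-- The span-based encoder from this answer, repeated unchanged.
encode :: String -> [Encoding]
encode [] = []
encode (h:t)
  | nh > 1    = Multiple nh h : tl
  | otherwise = Single h : tl
  where
    (s1, s2) = span (h ==) t
    nh       = 1 + length s1
    tl       = encode s2

-- span splits off the longest prefix whose elements satisfy the predicate.
spanDemo :: (String, String)
spanDemo = span (== 'a') "aaabcd"   -- ("aaa","bcd")

-- Only the first three runs of an infinite string are ever computed.
lazyDemo :: [Encoding]
lazyDemo = take 3 (encode (cycle "aab"))
```

Here lazyDemo evaluates to [Multiple 2 'a',Single 'b',Multiple 2 'a']. This works as long as each individual run is finite; a single infinite run such as repeat 'a' would still never finish, because its length has to be counted.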

For decoding, we again do not need an accumulator, and we can make use of replicate :: Int -> a -> [a]:

decode :: [Encoding] -> String
decode [] = []
decode (Single h:t) = h : decode t
decode (Multiple nh h:t) = replicate nh h ++ decode t

Or we can use the list monad:

decode :: [Encoding] -> String
decode = (>>= f)
    where f (Single h) = [h]
          f (Multiple nh h) = replicate nh h
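
The list-monad version may read more easily once the bind is spelled out: for lists, xs >>= f is the same as concatMap f xs. A minimal sketch of that reading, where expand is a hypothetical name for the helper called f above:

```haskell
data Encoding = Multiple Int Char | Single Char deriving (Eq, Show, Ord)

-- Expand one encoded run into the characters it stands for.
expand :: Encoding -> String
expand (Single c)     = [c]
expand (Multiple n c) = replicate n c

-- Map every run to its characters and concatenate the results.
decode :: [Encoding] -> String
decode = concatMap expand
```

This behaves the same as the two definitions above; which one to prefer is mostly a matter of taste.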

For example:

Prelude Data.List> decode [Single 'F',Multiple 2 'o',Single 'b',Multiple 3 'a',Single 'r',Multiple 3 ' ',Multiple 3 'q',Multiple 3 'u',Single 'x']
"Foobaaar   qqquuux"

Answer 2

As an extension of Willem Van Onsem's excellent answer, consider computing the run lengths separately and then combining them with the letters using zipWith.

Data.List has the function group (itself a specialization of the more general groupBy: group = groupBy (==)), which breaks a string into runs of the same character. That is:

group "aaabbccccd"
= ["aaa", "bb", "cccc", "d"]

Computing the length of each group gives you the run lengths.

Note that group is implemented in the same way as Willem's span-based solution.
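
As a quick sketch of that intermediate step, the run lengths can be paired with their letters before any Encoding is built. The name runLengths is hypothetical and is used only for this illustration.

```haskell
import Data.List (group)

-- Pair each run's length with its character. The pattern g@(c:_) names
-- both the whole run and its first character, so head is not needed.
runLengths :: String -> [(Int, Char)]
runLengths s = [ (length g, c) | g@(c:_) <- group s ]
```

For the string used above, runLengths "aaabbccccd" gives [(3,'a'),(2,'b'),(4,'c'),(1,'d')].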

import Data.List (group)

data Encoding = Multiple Int Char
              | Single       Char
  deriving (Eq, Show, Ord)

encode :: String -> [Encoding]
encode xs = zipWith op lengths letters
  where
  groups  = group xs
  lengths = map length groups
  letters = map head   groups

  op :: Int -> Char -> Encoding
  op 1 = Single
  op n = Multiple n

This can also be done as a rather ugly list comprehension.

encode xs = [ let (n, c) = (length g, head g)
              in  if   n == 1
                  then Single c
                  else Multiple n c
            | g <- group xs ]

Answer 3

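1. Your encoding function is a map. Instead of writing your own primitive recursive function, just use map.
2. Your encoding builds the output by appending at the end (xs ++ [Single c] and so on), which is both counterintuitive and expensive. Stop that.
3. Too many parentheses, for example in if (...) then, in case (...) of, and in the case arms (...) ->. All of these are unnecessary and clutter the code.
4. If you find yourself typing head, chances are you should be pattern matching somewhere.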

Consider:

encoding :: String -> [Encoding]
encoding = map enc . group       -- Point 1: use map, which also takes
                                 -- care of points 2 and 3.
 where
 enc [x]      = Single x
 enc xs@(x:_) = Multiple (length xs) x -- Point 4: pattern match instead of `head`
 -- Consider making the pattern match total, either via an error call or a Maybe type.
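
One further way to make the match total, sketched here as an assumption and not as something this answer proposes, is to use the group function from Data.List.NonEmpty, which returns non-empty groups so that the empty case cannot occur at all. Note this is a different technique from the error call or Maybe type mentioned in the comment above; encodeTotal is a hypothetical name.

```haskell
import           Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

data Encoding = Multiple Int Char | Single Char deriving (Eq, Show, Ord)

-- NE.group yields a list of NonEmpty runs, so enc covers every case.
encodeTotal :: String -> [Encoding]
encodeTotal = map enc . NE.group
  where
    enc :: NonEmpty Char -> Encoding
    enc (x :| [])  = Single x
    enc g@(x :| _) = Multiple (NE.length g) x
```

For example, encodeTotal "aab" gives [Multiple 2 'a',Single 'b'], and the compiler no longer has an unmatched [] case to warn about.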

Problem description

3 solutions

Solution 1
5 Accepted 2018-02-07 17:35:21

Solution 2
2 2018-02-07 17:52:47

Solution 3
1 2018-02-07 18:43:45
