Implementing `read` for a left-associative tree in Haskell

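I'm having a hard time implementing Read for a tree structure. I want to take a left-associative string (with parens) like ABC(DE)F and convert it into a tree. That particular example corresponds to the tree

[image: tree]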

Here is the data type I'm using (though I'm open to suggestions):

data Tree = Branch Tree Tree | Leaf Char deriving (Eq)

In Haskell, that particular tree would be:

example = Branch (Branch (Branch (Branch (Leaf 'A')
                                         (Leaf 'B'))
                                 (Leaf 'C'))
                         (Branch (Leaf 'D')
                                 (Leaf 'E')))
                 (Leaf 'F')

My show function looks like:

instance Show Tree where
    show (Branch l r@(Branch _ _)) = show l ++ "(" ++ show r ++ ")"
    show (Branch l r) = show l ++ show r
    show (Leaf x) = [x]

I want to write a read function such that

read "ABC(DE)F" == example

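This is a situation where using a parsing library makes the code remarkably short and extremely expressive. (I was surprised at how neat it turned out when I was experimenting to answer this!)

I'm going to use Parsec (that article provides some links for learning more), in "applicative mode" rather than monadic, since we don't need the extra power (or foot-shooting ability) of monads.

First, the various imports and definitions: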

import Text.Parsec

import Control.Applicative ((<*), (<$>))

data Tree = Branch Tree Tree | Leaf Char deriving (Eq, Show)

paren, tree, unit :: Parsec String st Tree

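Now, the basic unit of the tree is either a single character (that is not a parenthesis) or a parenthesised tree. A parenthesised tree is just a normal tree between ( and ). And a normal tree is just a sequence of units combined into branches left-associatively (the definitions are mutually recursive). In Haskell with Parsec: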

-- parenthesised tree or `Leaf <character>`
unit = paren <|> (Leaf <$> noneOf "()") <?> "group or literal"

-- normal tree between ( and )
paren = between (char '(') (char ')') tree  

-- all the units connected up left-associatedly
tree = foldl1 Branch <$> many1 unit

-- attempt to parse the whole input (don't short-circuit on the first error)
onlyTree = tree <* eof

(Yes, that's the entire parser!)

If we wanted to, we could do without paren and unit, but the code above is very expressive, so we can leave it as is.
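For comparison, here is a sketch of what the parser might look like with paren and unit folded into tree. It is meant to accept the same grammar as the version above, minus the custom error label; the Tree type is repeated so the block stands alone.

```haskell
import Text.Parsec

data Tree = Branch Tree Tree | Leaf Char deriving (Eq, Show)

-- Same grammar as the three-definition version, with `paren` and `unit`
-- inlined: one or more units, each either a parenthesised tree or a
-- single non-parenthesis character, folded together to the left.
tree :: Parsec String st Tree
tree = foldl1 Branch <$> many1 (between (char '(') (char ')') tree
                                <|> Leaf <$> noneOf "()")
```

This is shorter, but the named pieces in the original arguably read better and make it easy to attach a `<?>` label to unit.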

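As a brief explanation (I've given links to the documentation):

  • (<|>) basically means "left parser or right parser";
  • (<?>) allows you to produce nicer error messages;
  • noneOf parses any character that is not in the given list of characters;
  • between takes three parsers and returns the value of the third parser, as long as it is delimited by the first and second ones;
  • char parses its argument literally;
  • many1 parses one or more occurrences of its argument into a list (the empty string appears to be invalid here, hence many1 rather than many, which parses zero or more);
  • eof matches the end of the input.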
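To make the list above concrete, here is a small sketch that uses each combinator on its own. The names P, aOrB, notParen, inParens, allLetters and digitNamed are made up for this illustration and are not part of the answer's parser.

```haskell
import Text.Parsec

-- A throwaway parser type for these illustrations.
type P = Parsec String ()

-- (<|>): try the left parser; if it fails without consuming input,
-- try the right one.
aOrB :: P Char
aOrB = char 'a' <|> char 'b'

-- noneOf: any single character except those listed.
notParen :: P Char
notParen = noneOf "()"

-- between: parse '(' then the body then ')', keeping only the body's value.
inParens :: P String
inParens = between (char '(') (char ')') (many1 notParen)

-- many1 and eof: one or more letters, after which the input must end.
allLetters :: P String
allLetters = many1 letter <* eof

-- (<?>): give the expected item a readable name in error messages.
digitNamed :: P Char
digitNamed = digit <?> "a digit"
```

For instance, `parse inParens "" "(abc)"` should give `Right "abc"`, while `parse allLetters "" "ab1"` should fail, because eof does not match at the `1`.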

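We can use the parse function to run the parser (it returns Either ParseError Tree; Left is an error and Right is a successful parse).

As read

Using it as a read-like function might look something like this: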

read' str = case parse onlyTree "" str of
   Right tr -> tr
   Left er -> error (show er)

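(I've used read' to avoid conflicting with Prelude.read; if you want a Read instance you'll have to do a bit more work to implement readPrec (or whatever is required), but it shouldn't be too hard now that the actual parsing is done.)

Examples

Some basic examples: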
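As a rough sketch of what such an instance might look like (one possible approach, not necessarily the intended one): it implements readsPrec rather than readPrec, runs tree, and pairs the result with the unconsumed input obtained from Parsec's getInput. The parser definitions are repeated so the block stands alone.

```haskell
import Text.Parsec

data Tree = Branch Tree Tree | Leaf Char deriving (Eq, Show)

paren, tree, unit :: Parsec String st Tree
unit  = paren <|> (Leaf <$> noneOf "()") <?> "group or literal"
paren = between (char '(') (char ')') tree
tree  = foldl1 Branch <$> many1 unit

-- Assumption: readsPrec ignores the precedence argument and returns at
-- most one parse, together with whatever input `tree` left unconsumed.
instance Read Tree where
  readsPrec _ s =
    case parse ((,) <$> tree <*> getInput) "" s of
      Right (t, rest) -> [(t, rest)]
      Left _          -> []
```

With this in scope, `read "ABC(DE)F" :: Tree` goes through the Parsec parser. Two caveats: the derived Show used here does not print the compact form, so show and read do not round-trip, and because noneOf "()" accepts any other character, whitespace in the input becomes leaves as well.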

*Tree> read' "A"
Leaf 'A'

*Tree> read' "AB"
Branch (Leaf 'A') (Leaf 'B')

*Tree> read' "ABC"
Branch (Branch (Leaf 'A') (Leaf 'B')) (Leaf 'C')

*Tree> read' "A(BC)"
Branch (Leaf 'A') (Branch (Leaf 'B') (Leaf 'C'))

*Tree> read' "ABC(DE)F" == example
True

*Tree> read' "ABC(DEF)" == example
False

*Tree> read' "ABCDEF" == example
False

Demonstrating errors:

*Tree> read' ""
***Exception: (line 1, column 1):
unexpected end of input
expecting group or literal

*Tree> read' "A(B"
***Exception: (line 1, column 4):
unexpected end of input
expecting group or literal or ")"

Finally, the difference between tree and onlyTree:

*Tree> parse tree "" "AB)CD"     -- success: ignores ")CD"
Right (Branch (Leaf 'A') (Leaf 'B'))

*Tree> parse onlyTree "" "AB)CD" -- fail: can't parse the ")"
Left (line 1, column 3):
unexpected ')'
expecting group or literal or end of input

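Conclusion

Parsec is amazing! This answer might be long, but the core of it is just 5 or 6 lines of code that do all the work.

This looks very much like a stack structure. As you walk through the input string "ABC(DE)F", you wrap every atom you find (anything that is not a parenthesis) in a Leaf and push it onto an accumulator list. Whenever the list holds 2 items, you combine them into a Branch. This could be done with something like the following (note: untested, included only to give the idea):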

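-- The accumulator holds the trees seen so far in the current group;
-- the most recent one is at the head of the list.
read' :: [Tree] -> String -> (Tree, String)
read' [r,l] str  = read' [Branch l r] str
read' acc (c:cs)
   -- read the inner parenthesis
   | c == '('  = let (result, rest) = read' [] cs
                 in read' (result : acc) rest
   -- close parenthesis, return result, should be singleton
   | c == ')'  = case acc of
                   [result] -> (result, cs)
                   _        -> error "invalid input"
   -- otherwise, add a leaf
   | otherwise = read' (Leaf c : acc) cs
read' [result] [] = (result, [])
read' _ _  = error "invalid input"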

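This may need some modification, but I think it's enough to get you on the right track.

dbaupp's Parsec answer is very easy to understand. As an example of a "low-level" approach, here is a hand-written parser that uses a success continuation to handle building the left-associative tree: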
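As one possible direction for those modifications, here is a sketch of the same accumulator idea that returns Maybe instead of calling error and rejects unbalanced parentheses. The names go, push and readTreeMaybe are invented for this illustration, and the accumulator is a single optional tree rather than a list.

```haskell
data Tree = Branch Tree Tree | Leaf Char deriving (Eq, Show)

-- Parse one group, folding units to the left as they arrive.
-- The accumulator is Nothing until the first unit has been seen.
go :: Maybe Tree -> String -> Maybe (Tree, String)
go acc ('(' : cs) = do
  (inner, rest) <- go Nothing cs            -- parse the parenthesised group
  case rest of
    ')' : rest' -> go (Just (push acc inner)) rest'
    _           -> Nothing                  -- missing ')'
go acc s@(')' : _) = fmap (\t -> (t, s)) acc   -- leave ')' for the caller
go acc []          = fmap (\t -> (t, [])) acc  -- end of input
go acc (c : cs)    = go (Just (push acc (Leaf c))) cs

-- Attach a new unit on the right of whatever has been built so far.
push :: Maybe Tree -> Tree -> Tree
push Nothing  t = t
push (Just l) t = Branch l t

-- Succeed only if the whole string is consumed.
readTreeMaybe :: String -> Maybe Tree
readTreeMaybe s = case go Nothing s of
  Just (t, "") -> Just t
  _            -> Nothing
```

Under these definitions, empty input, an empty group such as "()", an unclosed "A(B" and a stray ")" all yield Nothing rather than a runtime error.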

import Control.Monad ((>=>))
import Data.Maybe (maybeToList)

instance Read Tree where readsPrec _prec s = maybeToList (readTree s)

type TreeCont = (Tree,String) -> Maybe (Tree,String)

readTree :: String -> Maybe (Tree,String)
readTree = read'top Just where
  valid ')' = False
  valid '(' = False
  valid _ = True

  read'top :: TreeCont -> String -> Maybe (Tree,String)
  read'top acc s@(x:ys) | valid x =
    case ys of
      [] -> acc (Leaf x,[])
      (y:zs) -> read'branch acc s
  read'top _ _ = Nothing

  -- The next three are mutually recursive

  read'branch :: TreeCont -> String -> Maybe (Tree,String)
  read'branch acc (x:y:zs) | valid x = read'right (combine (Leaf x) >=> acc) y zs
  read'branch _ _ = Nothing

  read'right :: TreeCont -> Char -> String -> Maybe (Tree,String)
  read'right acc y ys | valid y = acc (Leaf y,ys)
  read'right acc '(' ys = read'branch (drop'close >=> acc) ys
     where drop'close (b,')':zs) = Just (b,zs)
           drop'close _ = Nothing
  read'right _ _ _ = Nothing  -- assert y==')' here

  combine :: Tree -> TreeCont
  combine build (t, []) = Just (Branch build t,"")
  combine build (t, ys@(')':_)) = Just (Branch build t,ys)  -- stop when lookahead shows ')'
  combine build (t, y:zs) = read'right (combine (Branch build t)) y zs
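The parser above chains its continuations with `>=>` (Kleisli composition from Control.Monad). As a small illustration of how that composition behaves in the Maybe monad, here is a sketch with a simplified continuation type; Cont, dropClose, addTo and closeThenAdd are hypothetical stand-ins that use Int in place of Tree.

```haskell
import Control.Monad ((>=>))

-- Hypothetical stand-in for TreeCont, with Int instead of Tree.
type Cont = (Int, String) -> Maybe (Int, String)

-- Succeeds only if the remaining input starts with ')', which it drops.
dropClose :: Cont
dropClose (n, ')' : rest) = Just (n, rest)
dropClose _               = Nothing

-- Adds a fixed amount to the result and leaves the input alone.
addTo :: Int -> Cont
addTo k (n, rest) = Just (n + k, rest)

-- f >=> g runs f and, only if it returns Just, feeds the result to g.
closeThenAdd :: Cont
closeThenAdd = dropClose >=> addTo 10
```

Here `closeThenAdd (1, ")x")` evaluates to `Just (11, "x")`, and `closeThenAdd (1, "x")` evaluates to `Nothing` because the first step fails and the second is never run. In the parser, `drop'close >=> acc` works the same way: the closing parenthesis is consumed first, and only then is the rest of the pending work applied.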
