Parsing variable names in Haskell

Question

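I need to write code that parses some language. I am stuck on parsing variable names: a name can be anything at least 1 character long, it starts with a lowercase letter, and it may contain the underscore character '_'. I think I have made a start with the following code: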

identToken :: Parser String
identToken = do 
                       c <- letter
                       cs <- letdigs
                       return (c:cs)
             where letter = satisfy isLetter
                   letdigs = munch isLetter +++ munch isDigit +++ munch underscore
                   num = satisfy isDigit
                   underscore = \x -> x == '_'
                   lowerCase = \x -> x `elem` ['a'..'z'] -- how to add this function to current code?

ident :: Parser Ident
ident = do 
          _ <- skipSpaces
          s <- identToken
          skipSpaces; return $ s

idents :: Parser Command
idents = do 
          skipSpaces; ids <- many1 ident
          ...

However, this function gives me a strange result. If I call my test function

test_parseIdents :: String -> Either Error [Ident]
test_parseIdents p = 
  case readP_to_S prog p of
    [(j, "")] -> Right j
    [] -> Left InvalidParse
    multipleRes -> Left (AmbiguousIdents multipleRes)
  where
    prog :: Parser [Ident]
    prog = do
      result <- many ident
      eof
      return result

like this:

test_parseIdents  "test"

I get this:

Left (AmbiguousIdents [(["test"],""),(["t","est"],""),(["t","e","st"],""),
    (["t","e","st"],""),(["t","est"],""),(["t","e","st"],""),(["t","e","st"],""),
    (["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
    (["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
    (["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
    (["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
    (["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
    (["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
    (["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],""),
    (["t","e","s","t"],""),(["t","e","s","t"],""),(["t","e","s","t"],"")])

Note that Parser a is just a synonym for ReadP a.

I would also like to encode in the parser that variable names must start with a lowercase letter.

Thank you for your help.

Answer 1

Part of the problem is your use of the +++ operator. The following code works for me:

import Data.Char
import Text.ParserCombinators.ReadP

type Parser a = ReadP a
type Ident = String

identToken :: Parser String
identToken = do c <- satisfy lowerCase
                cs <- letdigs
                return (c:cs)
  where lowerCase = \x -> x `elem` ['a'..'z']
        underscore = \x -> x == '_'
        letdigs = munch (\c -> isLetter c || isDigit c || underscore c)

ident :: Parser Ident
ident = do _ <- skipSpaces
           s <- identToken
           skipSpaces
           return s

test_parseIdents :: String -> Either String [Ident]
test_parseIdents p = case readP_to_S prog p of
    [(j, "")]   -> Right j
    []          -> Left "Invalid parse"
    multipleRes -> Left ("Ambiguous idents: " ++ show multipleRes)
  where prog :: Parser [Ident]
        prog = do result <- many ident
                  eof
                  return result

main = print $ test_parseIdents "test_1349_zefz"

So what went wrong:

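+++ is symmetric choice: it runs both of its arguments and keeps every alternative that succeeds. Because munch always succeeds (possibly consuming nothing), each of the three alternatives in letdigs produces a result, and many ident then multiplies those results into the long list of parses you saw. <++ is left-biased, so only the left-most succeeding option is kept. That would remove the ambiguity in the parse, but it still leaves the next problem.
Your parser accepted a run of letters, or a run of digits, or a run of underscores after the first character, but never a mix of them. A digit after an underscore, for example, would fail. The parser had to be modified to munch characters that are letters, digits, or underscores.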
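To make the difference between the two choice operators concrete, here is a minimal sketch that runs both on the same input. The names symmetric and leftBiased are made up for this illustration and are not part of the question's code.

```haskell
import Data.Char (isDigit, isLetter)
import Text.ParserCombinators.ReadP

-- Symmetric choice: every alternative that succeeds contributes a result.
-- munch never fails, so "munch isDigit" succeeds with the empty string
-- even when the input starts with letters.
symmetric :: ReadP String
symmetric = munch isLetter +++ munch isDigit

-- Left-biased choice: the right-hand side is only used when the
-- left-hand side produces no result at all.
leftBiased :: ReadP String
leftBiased = munch isLetter <++ munch isDigit

main :: IO ()
main = do
  print (readP_to_S symmetric "ab1")   -- two parses, one of them empty
  print (readP_to_S leftBiased "ab1")  -- a single parse
```

With symmetric, the input "ab1" gives two results, the letters "ab" and the empty match from munch isDigit. With leftBiased, only ("ab","1") remains.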

I also removed some unused functions and made an educated guess about the definitions of your data types.
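As a compact check of the lowercase-first rule, the sketch below restates the corrected parser in applicative style. It assumes that isAsciiLower from Data.Char is an acceptable replacement for the hand-written lowerCase predicate, and parseIdents is a hypothetical helper introduced only for this example.

```haskell
import Data.Char (isAsciiLower, isDigit, isLetter)
import Text.ParserCombinators.ReadP

-- One identifier: a lowercase ASCII letter followed by any mix of
-- letters, digits and underscores. munch is greedy and yields exactly
-- one result, so the identifier cannot be split in several ways.
identToken :: ReadP String
identToken = (:) <$> satisfy isAsciiLower <*> munch identChar
  where
    identChar c = isLetter c || isDigit c || c == '_'

-- Hypothetical helper: parse whitespace-separated identifiers and
-- require that the whole input is consumed.
parseIdents :: String -> Maybe [String]
parseIdents input =
  case readP_to_S (skipSpaces *> many (identToken <* skipSpaces) <* eof) input of
    [(ids, "")] -> Just ids
    _           -> Nothing

main :: IO ()
main = mapM_ (print . parseIdents) ["test_1349_zefz", "foo bar_2", "Test", "_x"]
```

The first two inputs parse to Just ["test_1349_zefz"] and Just ["foo","bar_2"]. "Test" and "_x" give Nothing because neither starts with a lowercase letter.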

Solution 1
3, accepted, 2015-09-24 07:10:51
