How do I return multiple parse failures within Parsec's monadic context?

Question

I have a grammar I am parsing which consists of exactly two required and unique logical parts, Alpha and Beta. These parts can be defined in any order, Alpha before Beta or vice versa. I would like to provide robust error messages for less tech-savvy users.

In the example below there are cases where multiple parse failures exist. I concatenate the failure message Strings with the unlines function and pass the result to the fail combinator. This creates a ParseError value with a single Message value when parse is called on grammarDefinition.

Example Scenario:

import Data.Either                   (partitionEithers)
import Data.Set                      (Set)
import Text.Parsec                   (Parsec)
import Text.Parsec.Char
import Text.ParserCombinators.Parsec

data Result = Result Alpha Beta
type Alpha  = Set (Int,Float)
type Beta   = Set String

grammarDefinition :: Parsec String u Result
grammarDefinition = do
    segments <- partitionEithers <$> many segment
    _        <- eof
    case segments of
      (     [],      []) -> fail $ unlines [missingAlpha, missingBeta]
      (      _,      []) -> fail $ missingBeta
      (     [],       _) -> fail $ missingAlpha
      ((_:_:_), (_:_:_)) -> fail $ unlines [multipleAlpha, multipleBeta]
      (      _, (_:_:_)) -> fail $ multipleBeta
      ((_:_:_),       _) -> fail $ multipleAlpha
      (    [x],     [y]) -> pure $ Result x y
    where
      missingAlpha     = message "No" "alpha"
      missingBeta      = message "No" "beta"
      multipleAlpha    = message "Multiple" "alpha"
      multipleBeta     = message "Multiple" "beta"
      message x y      = concat [x," ",y," defined in input, ","exactly one ",y," definition required"]

-- Type signature is important!
segment :: Parsec String u (Either Alpha Beta)
segment = undefined -- implementation irrelevant
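The single-message behaviour can be checked in isolation. The following is a minimal sketch, not part of the grammar above: the parser `joined` and the helper `failureMessages` are hypothetical names introduced only for illustration, and the sketch assumes a parsec version where `fail` on a parser maps to parserFail.

```haskell
import Text.Parsec (Parsec, parse)
import Text.Parsec.Error (errorMessages, messageString)

-- Hypothetical parser that fails the same way grammarDefinition does
-- when both parts are missing: one string built with unlines.
joined :: Parsec String () ()
joined = fail (unlines ["No alpha defined in input", "No beta defined in input"])

-- Hypothetical helper: run a parser on empty input and return the text of
-- every message stored in the resulting ParseError (empty on success).
failureMessages :: Parsec String () a -> [String]
failureMessages p =
  either (map messageString . errorMessages) (const []) (parse p "" "")
```

Evaluating `length (failureMessages joined)` shows how many messages the ParseError carries; the two failure lines are stored as one Message containing a newline.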

I would like the ParseError to contain multiple Message values in the case of multiple failures. This should be possible, given that the addErrorMessage function exists. I am not sure how to supply multiple failures within the Parsec monadic context, before the result is materialized by calling parse.

Example Function:

fails :: [String] -> ParsecT s u m a
fails = undefined -- Not sure how to define this!

How do I supply multiple Message values to the ParseError result within Parsec's monadic context?

Answer 1

fail in this case is equivalent to parserFail, defined in Text.Parsec.Prim:

parserFail :: String -> ParsecT s u m a
parserFail msg
    = ParsecT $ \s _ _ _ eerr ->
      eerr $ newErrorMessage (Message msg) (statePos s)

Since newErrorMessage and addErrorMessage both create a ParseError, this variation of parserFail should also work:

parserFail' :: String -> ParsecT s u m a
parserFail' msg
    = ParsecT $ \s _ _ _ eerr ->
      eerr $ theMessages s
  where
    -- Start from the original failure message, then stack two more on top.
    theMessages s =
      addErrorMessage (Message "blah") $
        addErrorMessage (Expect "expected this") $
          newErrorMessage (Message msg) (statePos s)

which should push three messages onto the error message list.

Also in that module, have a look at label and labels, which is the only place where addErrorMessage is used. labels is just a multi-message version of the <?> operator. Note how it uses foldr to build up a compound error message:

labels :: ParsecT s u m a -> [String] -> ParsecT s u m a
labels p msgs =
    ParsecT $ \s cok cerr eok eerr ->
    let eok' x s' error = eok x s' $ if errorIsUnknown error
                  then error
                  else setExpectErrors error msgs
        eerr' err = eerr $ setExpectErrors err msgs

    in unParser p s cok cerr eok' eerr'

 where
   setExpectErrors err []         = setErrorMessage (Expect "") err
   setExpectErrors err [msg]      = setErrorMessage (Expect msg) err
   setExpectErrors err (msg:msgs)
       = foldr (\msg' err' -> addErrorMessage (Expect msg') err')
         (setErrorMessage (Expect msg) err) msgs

The only gotcha is that you need access to the ParsecT constructor, which is not exported by Text.Parsec.Prim. Maybe you can find a way to use labels, or another way around that problem. Otherwise you could always include your own hacked version of parsec with your code.
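One possible way around the constructor problem is sketched below. It assumes the installed parsec version exports mkPT together with the Consumed, Reply, and State constructors from Text.Parsec.Prim, and newErrorUnknown from Text.Parsec.Error; check the export lists of your version before relying on it. The helper `failureMessages` is a hypothetical name used only to inspect the result.

```haskell
import Text.Parsec (Parsec, parse)
import Text.Parsec.Error
  ( Message (..), addErrorMessage, errorMessages, messageString, newErrorUnknown )
import Text.Parsec.Prim (Consumed (..), ParsecT, Reply (..), State (..), mkPT)

-- Sketch of the requested function: fail without consuming input and
-- attach one Message per string at the current source position.
-- With an empty list the result is an "unknown" error with no messages.
fails :: Monad m => [String] -> ParsecT s u m a
fails msgs = mkPT $ \s ->
  let err = foldr (addErrorMessage . Message) (newErrorUnknown (statePos s)) msgs
  in  return (Empty (return (Error err)))

-- Hypothetical helper: run a parser on empty input and return the text of
-- every message stored in the resulting ParseError (empty on success).
failureMessages :: Parsec String () a -> [String]
failureMessages p =
  either (map messageString . errorMessages) (const []) (parse p "" "")
```

With this definition, `failureMessages (fails ["No alpha", "No beta"])` should list both strings as separate Message values rather than one joined string.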

Answer 2

We can leverage the fact that ParsecT is an instance of MonadPlus and combine mzero with the labels function to derive the desired result:

fails :: [String] -> ParsecT s u m a
fails = labels mzero

Note: The ParseError has many Expect values, not many Message values...
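The difference between Expect and Message values can be observed directly. This is a small sketch with the type of fails specialised to Parsec String () for easy testing; `messageKinds` is a hypothetical helper that names the constructor of each message in the error.

```haskell
import Control.Monad (mzero)
import Text.Parsec (Parsec, parse)
import Text.Parsec.Error (Message (..), errorMessages)
import Text.Parsec.Prim (labels)

-- Same definition as in this answer, with a concrete parser type.
fails :: [String] -> Parsec String () a
fails = labels mzero

-- Hypothetical helper: run a parser on empty input and report which
-- constructor each message in the resulting ParseError uses.
messageKinds :: Parsec String () a -> [String]
messageKinds p = either (map kind . errorMessages) (const []) (parse p "" "")
  where
    kind (SysUnExpect _) = "SysUnExpect"
    kind (UnExpect _)    = "UnExpect"
    kind (Expect _)      = "Expect"
    kind (Message _)     = "Message"
```

Because the messages are Expect values, the rendered error lists them after "expecting" instead of printing each one as a standalone message.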

Answer 3

I would recommend transitioning from Parsec to the newer and more extensible Megaparsec library.

This exact issue has been resolved since version 4.2.0.0.

Multiple parse error Messages can easily be created with the following function:

fails :: MonadParsec m => [String] -> m a
fails = failure . fmap Message

3 answers

Answer 1
2 2015-09-15 19:59:52

Answer 2
0 2015-09-16 14:49:05

Answer 3
0 Accepted 2015-12-19 02:17:44
