Returning meaningful error messages from a parser written with Scala Parser Combinators

I am trying to write a parser in Scala using Parser Combinators. If I match recursively,

def body: Parser[Body] =
("begin" ~> statementList  )  ^^ {
     case s => {   new Body(s); }
}

def statementList : Parser[List[Statement]] = 
  ("end" ^^ { _ => List() } )|
  (statement ~ statementList ^^ { case statement ~ statementList => statement :: statementList  })

then I get good error messages whenever there is a fault in a statement. However, this is ugly, long code, so I'd like to write this:

def body: Parser[Body] =
("begin" ~> statementList <~ "end"  )  ^^ {
   case s => {   new Body(s); }
}

def statementList : Parser[List[Statement]] = 
    rep(statement)

This code works, but it only prints meaningful messages if there is an error in the FIRST statement. If the error is in a later statement, the message becomes painfully unusable, because the parser wants to see the whole erroneous statement replaced by the "end" token:

Exception in thread "main" java.lang.RuntimeException: [4.2] error: "end" expected but "let" found

 let b : string = x(3,b,"WHAT???",!ERRORHERE!,7 )
 ^

My question: is there a way to get rep and repsep to work with meaningful error messages that place the caret at the right position, rather than at the beginning of the repeating fragment?

Ah, found the solution! It turns out that you need to use the function phrase on your main parser to return a new parser that tracks the last position at which a failure occurred.

I changed:

def parseCode(code: String): Program = {
 program(new lexical.Scanner(code)) match {
      case Success(program, _) => program
      case x: Failure => throw new RuntimeException(x.toString())
      case x: Error => throw new RuntimeException(x.toString())
  }

}

def program : Parser[Program] ...

into:

def parseCode(code: String): Program = {
 phrase(program)(new lexical.Scanner(code)) match {
      case Success(program, _) => program
      case x: Failure => throw new RuntimeException(x.toString())
      case x: Error => throw new RuntimeException(x.toString())
  }

}


def program : Parser[Program] ...
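
To try the effect of phrase in isolation, here is a minimal sketch on a toy grammar. It assumes RegexParsers from the scala-parser-combinators library rather than the token-based parser above, and the names PhraseDemo, ident, number, parseBare and parseWhole are made up for illustration. The exact message and caret position depend on the library version, so compare the two printed results yourself.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical toy grammar, not the asker's: begin { let <name> = <number> } end
object PhraseDemo extends RegexParsers {
  def ident: Parser[String] = """[a-z]+""".r
  def number: Parser[Int] = """\d+""".r ^^ (_.toInt)

  def statement: Parser[(String, Int)] =
    ("let" ~> ident) ~ ("=" ~> number) ^^ { case name ~ value => (name, value) }

  def body: Parser[List[(String, Int)]] = "begin" ~> rep(statement) <~ "end"

  // Without phrase: the parser is applied as-is.
  def parseBare(src: String): ParseResult[List[(String, Int)]] = parse(body, src)

  // With phrase: the parser must consume the whole input.
  def parseWhole(src: String): ParseResult[List[(String, Int)]] = parse(phrase(body), src)
}

object PhraseDemoApp extends App {
  // The second statement is broken: "oops" is not a number.
  val src = "begin let a = 1 let b = oops end"
  println(PhraseDemo.parseBare(src))
  println(PhraseDemo.parseWhole(src))
}
```

With RegexParsers, parseAll(body, src) is a shorthand for parse(phrase(body), src).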

You can do it by combining a "home-made" rep method with non-backtracking sequencing (~!) inside the statements. For example:

scala> object X extends RegexParsers {
     |   def myrep[T](p: => Parser[T]): Parser[List[T]] = p ~! myrep(p) ^^ { case x ~ xs => x :: xs } | success(List())
     |   def t1 = "this" ~ "is" ~ "war"
     |   def t2 = "this" ~! "is" ~ "war"
     |   def t3 = "begin" ~ rep(t1) ~ "end"
     |   def t4 = "begin" ~ myrep(t2) ~ "end"
     | }
defined module X

scala> X.parse(X.t4, "begin this is war this is hell end")
res13: X.ParseResult[X.~[X.~[String,List[X.~[X.~[String,String],String]]],String]] =
[1.27] error: `war' expected but ` ' found

begin this is war this is hell end
                          ^

scala> X.parse(X.t3, "begin this is war this is hell end")
res14: X.ParseResult[X.~[X.~[String,List[X.~[X.~[String,String],String]]],String]] =
[1.19] failure: `end' expected but ` ' found

begin this is war this is hell end
                  ^
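
The same idea can be sketched on a grammar shaped like the one in the question. This is only an illustration under assumptions: the toy statement syntax and the names CommitDemo, ident and number are invented here, and it uses the library's own rep instead of myrep, on the assumption that the version in use propagates a committed error out of rep. Once "let" has matched, ~! commits to the statement, so a later mismatch should be reported inside the statement instead of being retried as "end".

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical toy grammar: begin { let <name> = <number> } end
object CommitDemo extends RegexParsers {
  def ident: Parser[String] = """[a-z]+""".r
  def number: Parser[Int] = """\d+""".r ^^ (_.toInt)

  // After "let" has matched, ~! makes any later failure non-backtracking.
  def statement: Parser[(String, Int)] =
    "let" ~! ident ~! "=" ~! number ^^ { case _ ~ name ~ _ ~ value => (name, value) }

  def body: Parser[List[(String, Int)]] = "begin" ~> rep(statement) <~ "end"
}

object CommitDemoApp extends App {
  // Well-formed input, then an input whose second statement is broken.
  println(CommitDemo.parse(CommitDemo.body, "begin let a = 1 let b = 2 end"))
  println(CommitDemo.parse(CommitDemo.body, "begin let a = 1 let b = oops end"))
}
```

The trade-off is that a committed statement can no longer serve as one branch of an alternative that shares the same first token, so ~! belongs only after a token that uniquely identifies the construct.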
