
Parboiled2: How to process dependent fields?

I'm trying to parse a file format, using the excellent parboiled2 library, in which the presence of some fields is dependent upon the value of one or more fields already processed.

For example, say I have two fields, the first of which is a flag indicating whether the second is present. That is, if the first field is true, then the second field (which is an integer value, in this example) is present and must be processed - but if it's false, then the second field isn't present at all. Note that this second field isn't optional - it either must be processed (if the first field is true) or must not be processed (if the first field is false).

So, if a third field (which we'll assume is always present) is a quoted string, both of the following lines are valid:

true 52 "Some quoted string"
false "Some other quoted string"

But this would be invalid:

false 25 "Yet another quoted string"
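To pin down the behaviour expected for these three lines, here is a minimal library-free sketch in plain Scala. It is not parboiled2 and not the parser being asked about; the names DependentFieldsSketch and parseFirstTwo are hypothetical, and splitting on whitespace is a simplifying assumption that only holds for the first two fields.

```scala
// Library-free sketch of the intended semantics (NOT parboiled2).
// All names here are hypothetical and exist only for illustration.
object DependentFieldsSketch {

  // Parses the first two fields of a line.
  // Returns Some((flag, maybeInt)) when they are valid, None otherwise.
  def parseFirstTwo(line: String): Option[(Boolean, Option[Int])] = {
    // Split into at most three tokens: flag, next field, rest of the line.
    val tokens = line.trim.split("\\s+", 3).toList
    tokens match {
      // Flag is true: an integer field must follow.
      case "true" :: n :: _ if n.nonEmpty && n.forall(_.isDigit) =>
        Some((true, Some(n.toInt)))
      // Flag is false: the integer field must be absent, so the next
      // token has to be the start of the quoted string.
      case "false" :: next :: _ if next.startsWith("\"") =>
        Some((false, None))
      case _ => None
    }
  }

  def main(args: Array[String]): Unit = {
    println(parseFirstTwo("true 52 \"Some quoted string\""))        // valid
    println(parseFirstTwo("false \"Some other quoted string\""))    // valid
    println(parseFirstTwo("false 25 \"Yet another quoted string\"")) // invalid
  }
}
```

The first two calls yield a defined result and the third yields None, which is the accept/reject behaviour a parboiled2 rule would need to reproduce.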

Ignoring the third field, how do I write a rule to parse the first two? (I can't tell from the documentation, and Googling hasn't helped so far...)

UPDATE: I should clarify that I can't use rules like the following, because the format I'm parsing is actually a lot more complicated than my example:

import org.parboiled2._

class MyParser(override val input: ParserInput)
extends Parser {

  def ws = // whitespace rule, puts nothing on the stack.

  def intField = // parse integer field, pushes Int onto stack...

  def dependentFields = rule {
    ("true" ~ ws ~ intField) | "false" ~> //etc.
  }
}

UPDATE 2: I've revised the following to make my intent clearer:

What I'm looking for is a valid equivalent to the following (non-existent) rule that performs a match only if a condition is satisfied:

import org.parboiled2._

class MyParser(val input: ParserInput)
extends Parser {

  def ws = // whitespace rule, puts nothing on the stack.

  def intField = // parse integer field, pushes Int onto stack...

  def boolField = // parse boolean field, pushes Boolean onto stack...

  def dependentFields = rule {
    boolField ~> {b =>

      // Match "ws ~ intField" only if b is true. If match succeeds, push Some(Int); if match
      // fails, the rule fails. If b is false, pushes None without attempting the match.
      conditional(b, ws ~ intField)
    }
  }
}

That is, ws ~ intField is only matched if boolField results in a true value. Is something like this possible?

Yes, you can implement such a function with the help of the test parser action:

def conditional[U](bool: Boolean, parse: () => Rule1[U]): Rule1[Option[U]] = rule {
  test(bool) ~ parse() ~> (Some(_)) | push(None)
}

According to the Meta-Rules section of the documentation, this works only if you pass a function that produces the rule, rather than the rule itself. You'd have to define the dependentFields rule as follows:

def dependentFields = rule {
  boolField ~> (conditional(_, () => rule { ws ~ intField }))
}

Update:

While test(pred) ~ opt1 | opt2 is a common technique, it does backtrack and try to apply opt2 if test succeeds but opt1 fails. Here are two possible solutions to prevent such backtracking.

You can use the ~!~ rule combinator, which has "cut" semantics and prohibits backtracking over itself:

def conditional2[U](bool: Boolean, parse: () => Rule1[U]): Rule1[Option[U]] = rule {
  test(bool) ~!~ parse() ~> (Some(_)) | push(None)
}

Or you can use a plain if outside of a rule to check the boolean argument and return one of two possible rules:

def conditional3[U](bool: Boolean, parse: () => Rule1[U]): Rule1[Option[U]] =
  if (bool) rule { parse() ~> (Some(_: U)) } 
  else rule { push(None) }

I would do something like this:

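class MyParser(val input: ParserInput)
extends Parser {

  // ws, intField and stringField are assumed to be defined elsewhere in this class.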

  def dependentFields: Rule1[(Boolean, Option[Int], String)] = rule {
     ("true" ~ ws ~ trueBranch | "false" ~ ws ~ falseBranch)
  }

  def trueBranch = rule {
     intField ~ ws ~ stringField ~> { (i, s) => (true, Some(i), s) }
  }

  def falseBranch = rule {
     stringField ~> { s => (false, None, s) }
  }
}
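For reference, a self-contained version of this branching approach might look like the sketch below. It assumes parboiled2 2.x is on the classpath. The bodies of ws, intField and stringField, the line rule, and the names LineParser and LineParserDemo are assumptions made for illustration; they are not given in the question or the answers.

```scala
import org.parboiled2._

// Sketch only: field rule bodies below are assumed, not taken from the question.
class LineParser(val input: ParserInput) extends Parser {
  type Fields = (Boolean, Option[Int], String)

  // One or more spaces; pushes nothing onto the value stack.
  def ws: Rule0 = rule { oneOrMore(' ') }

  // Unsigned decimal integer; pushes an Int.
  def intField: Rule1[Int] = rule {
    capture(oneOrMore(CharPredicate.Digit)) ~> ((s: String) => s.toInt)
  }

  // Double-quoted string without escapes; pushes the unquoted text.
  def stringField: Rule1[String] = rule {
    '"' ~ capture(zeroOrMore(noneOf("\""))) ~ '"'
  }

  // A whole line: the dependent fields followed by end of input.
  def line: Rule1[Fields] = rule { dependentFields ~ EOI }

  // The flag literal selects which branch parses the rest of the line.
  def dependentFields: Rule1[Fields] = rule {
    "true" ~ ws ~ trueBranch | "false" ~ ws ~ falseBranch
  }

  def trueBranch: Rule1[Fields] = rule {
    intField ~ ws ~ stringField ~> ((i: Int, s: String) => (true, Option(i), s))
  }

  def falseBranch: Rule1[Fields] = rule {
    stringField ~> ((s: String) => (false, Option.empty[Int], s))
  }
}

object LineParserDemo {
  // Uses the default delivery scheme, so the result is a scala.util.Try.
  def parse(s: String) = new LineParser(s).line.run()

  def main(args: Array[String]): Unit = {
    println(parse("true 52 \"Some quoted string\"").isSuccess)
    println(parse("false \"Some other quoted string\"").isSuccess)
    println(parse("false 25 \"Yet another quoted string\"").isSuccess)
  }
}
```

Because each branch starts with a distinct literal, the integer field is only ever attempted after "true", so the invalid third line is rejected when stringField meets a digit instead of a quote.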
