Parboiled2: how to handle dependent fields?

Question

I'm trying to parse a file format, using the excellent parboiled2 library, in which the presence of some fields depends on the value of one or more fields already processed.

For example, say I have two fields, the first of which is a flag indicating whether the second is present. That is, if the first field is true, then the second field (an integer value in this example) is present and must be processed; if it's false, the second field isn't present at all. Note that this second field isn't optional: it either must be processed (if the first field is true) or must not be processed (if the first field is false).

So, if a third field (which we'll assume is always present) is a quoted string, both of the following lines are valid:

true 52 "Some quoted string"
false "Some other quoted string"

But this would be invalid:

false 25 "Yet another quoted string"

Ignoring the third field, how do I write a rule to parse the first two? (I can't tell from the documentation, and Googling hasn't helped so far...)

UPDATE: I should clarify that I can't use rules like the following, because the format I'm parsing is actually a lot more complicated than my example:

import org.parboiled2._

class MyParser(override val input: ParserInput)
extends Parser {

  def ws = // whitespace rule, puts nothing on the stack.

  def intField = // parse integer field, pushes Int onto stack...

  def dependentFields = rule {
    ("true" ~ ws ~ intField) | "false" ~> //etc.
  }
}

UPDATE 2: I've revised the following to make my intent clearer:

What I'm looking for is a valid equivalent to the following (non-existent) rule that performs a match only if a condition is satisfied:

import org.parboiled2._

class MyParser(val input: ParserInput)
extends Parser {

  def ws = // whitespace rule, puts nothing on the stack.

  def intField = // parse integer field, pushes Int onto stack...

  def boolField = // parse boolean field, pushes Boolean onto stack...

  def dependentFields = rule {
    boolField ~> {b =>

      // Match "ws ~ intField" only if b is true. If match succeeds, push Some(Int); if match
      // fails, the rule fails. If b is false, pushes None without attempting the match.
      conditional(b, ws ~ intField)
    }
  }
}

That is, ws ~ intField is only matched if boolField results in a true value. Is something like this possible?

Answer 1

Yes, you can implement such a function with the help of the test parser action:

def conditional[U](bool: Boolean, parse: () => Rule1[U]): Rule1[Option[U]] = rule {
  test(bool) ~ parse() ~> (Some(_)) | push(None)
}

According to the Meta-Rules section of the documentation, this works only if you pass a function that produces the rule. You'd have to define the dependentFields rule as follows:

def dependentFields = rule {
  boolField ~> (conditional(_, () => rule { ws ~ intField }))
}

Update:

While test(pred) ~ opt1 | opt2 is a common technique, it does backtrack and try to apply opt2 if test succeeds but opt1 fails. Here are two possible solutions to prevent such backtracking.

You can use the ~!~ rule combinator, which has "cut" semantics and prohibits backtracking over itself:

def conditional2[U](bool: Boolean, parse: () => Rule1[U]): Rule1[Option[U]] = rule {
  test(bool) ~!~ parse() ~> (Some(_)) | push(None)
}

Or you can use a plain if outside of a rule to check the boolean argument and return one of two possible rules:

def conditional3[U](bool: Boolean, parse: () => Rule1[U]): Rule1[Option[U]] =
  if (bool) rule { parse() ~> (Some(_: U)) } 
  else rule { push(None) }
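
To see how the pieces fit together, here is a minimal sketch of a complete parser built around the if-based variant above. It assumes parboiled2 2.x is on the classpath. The class name DependentFieldsParser, the line rule, and the concrete definitions of ws, boolField and intField are illustrative assumptions, since the question leaves those rules unspecified; only conditional and dependentFields follow the code in this answer.

```scala
import org.parboiled2._

// Hypothetical parser: the field rules below are assumptions made for
// illustration; the question does not define them.
class DependentFieldsParser(val input: ParserInput) extends Parser {

  // Assumed whitespace rule: one or more spaces or tabs, pushes nothing.
  def ws: Rule0 = rule { oneOrMore(anyOf(" \t")) }

  // Assumed boolean field: pushes true or false onto the value stack.
  def boolField: Rule1[Boolean] = rule {
    "true" ~ push(true) | "false" ~ push(false)
  }

  // Assumed integer field: unsigned decimal digits, pushed as an Int.
  def intField: Rule1[Int] = rule {
    capture(oneOrMore(CharPredicate.Digit)) ~> ((s: String) => s.toInt)
  }

  // The if-based variant from the answer: the branch is chosen outside
  // of any rule, so a failing parse() cannot fall back to push(None).
  def conditional[U](bool: Boolean, parse: () => Rule1[U]): Rule1[Option[U]] =
    if (bool) rule { parse() ~> (Some(_: U)) }
    else rule { push(None) }

  // Consumes the Boolean pushed by boolField and pushes an Option[Int].
  def dependentFields = rule {
    boolField ~> (conditional(_, () => rule { ws ~ intField }))
  }

  // Assumed top-level rule: the two fields followed by end of input.
  def line = rule { dependentFields ~ EOI }
}

object DependentFieldsDemo {
  def main(args: Array[String]): Unit = {
    // Inputs modelled on the first two fields of the question's sample lines.
    for (text <- Seq("true 52", "false", "false 25"))
      println(s"'$text' -> ${new DependentFieldsParser(text).line.run()}")
  }
}
```

With these assumed field rules, "true 52" should parse to Some(52) and "false" to None, while "false 25" should fail because the trailing " 25" is left unconsumed before EOI.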

Answer 2

I would do something like this:

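// ws, intField and stringField are assumed to be defined elsewhere in the parser.
class MyParser(val input: ParserInput) extends Parser {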

  def dependentFields: Rule1[(Boolean, Option[Int], String)] = rule {
     ("true" ~ ws ~ trueBranch | "false" ~ ws ~ falseBranch)
  }

  def trueBranch = rule {
     intField ~ ws ~ stringField ~> { (i, s) => (true, Some(i), s) }
  }

  def falseBranch = rule {
     stringField ~> { s => (false, None, s) }
  }
}

Answer 1
2 | accepted | 2018-02-21 12:31:13

Answer 2
0 | 2018-02-21 03:10:19