
Scala Parser Combinators: Handling Repetition of a Type

I am new to the Scala language and its parser combinators. I was working on a task and got stuck on one requirement: parsing a repeated type. For example, I created parsers for a logical operator and a word (meaning a string):

def logicalOperator: Parser[Operator] = "(&&|\\|\\|)".r ^^ { case op => Operator(op) }

def word: Parser[Word] = """[a-z\\\s]+""".r ^^ { case x => Word(x) }

Now my input may contain a single word, or several words separated by operators. For example:

    input1 => once&&upon||time||forest

    input2 => therewasatime // also a valid input; it does not contain any operators

I would process the words according to the operators between them. If no operator is present (i.e. the input is a single word), I would process that single word.

The && and || operators would decide the operation. (To make this concrete, they can be considered similar to the && and || operators on boolean values.)

I was thinking of a case class Sentence that would represent a single word as well as multiple words. With multiple words it would contain the operator; with a single word, the operator and the second word would be null.

case class Sentence(word1: Word, op: Operator, word2: Word)

So this would be a tree structure in which the leaf nodes contain only a Word and the remaining nodes contain operators.

But I am not sure how to write the Sentence parser. I tried using:

def sentence = repsep(word, logicalOperator) ^^ { words => ??? /* create the Sentence object here */ }

But I cannot extract the operators from the repsep() operation.

Any suggestions for this case?

Thanks

The problem with repsep is that it discards the result of its second argument, so you won't be able to identify which operator was used. Another question is how you want once&&upon||time||forest to be represented if a Sentence can only contain Words, not other Sentences. In the following, I assume you meant something like this:

trait Node
case class Operator(s: String)
case class Word(s: String) extends Node
case class Sentence(a: Node, op: Operator, b: Node) extends Node

Then you can write sentence like this:

def sentence: Parser[Node] = word ~ opt(logicalOperator ~ sentence) ^^ {
    // a word followed by an operator and the rest of the sentence
    case w ~ Some(op ~ rest) => Sentence(w, op, rest)
    // a single word with no operator
    case w ~ None            => w
}
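For anyone who wants to try this end to end, here is a minimal self-contained sketch. It assumes the scala-parser-combinators module is on the classpath, and it simplifies the word regex to lowercase letters only. The object name SentenceParser and the helper parseSentence are made up for this illustration and are not part of the question.

```scala
import scala.util.parsing.combinator.RegexParsers

sealed trait Node
case class Operator(s: String)
case class Word(s: String) extends Node
case class Sentence(a: Node, op: Operator, b: Node) extends Node

// Hypothetical wrapper object; the name is an assumption for this sketch.
object SentenceParser extends RegexParsers {
  // Matches either && or ||.
  def logicalOperator: Parser[Operator] = """&&|\|\|""".r ^^ (op => Operator(op))

  // Assumption: words are lowercase letters only (simpler than the regex in the question).
  def word: Parser[Word] = """[a-z]+""".r ^^ (w => Word(w))

  // A word, optionally followed by an operator and another sentence.
  def sentence: Parser[Node] = word ~ opt(logicalOperator ~ sentence) ^^ {
    case w ~ Some(op ~ rest) => Sentence(w, op, rest)
    case w ~ None            => w
  }

  // Hypothetical helper: parse the whole input, returning the error message on failure.
  def parseSentence(input: String): Either[String, Node] =
    parseAll(sentence, input) match {
      case Success(node, _)   => Right(node)
      case failure: NoSuccess => Left(failure.msg)
    }
}
```

Calling SentenceParser.parseSentence("therewasatime") should give Right(Word("therewasatime")), since the optional part is simply absent. An input that ends in a dangling operator, such as "once&&", should come back as a Left, because parseAll requires the whole input to be consumed.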

With this method, once&&upon||time||forest is parsed as

Sentence(
    Word(once),
    Operator(&&),
    Sentence(
        Word(upon),
        Operator(||),
        Sentence(
            Word(time),
            Operator(||),
            Word(forest)
        )
    )
)
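Note that the tree above nests to the right. If you would rather stay close to the original repsep attempt and have the tree nest to the left, one possible variation (a sketch, not the only way to do it) is to use rep on operator/word pairs, which keeps the operators, and then fold the pairs into a tree. As before, this assumes the scala-parser-combinators module is available; the object name LeftNestedParser and the simplified word regex are assumptions made for this illustration.

```scala
import scala.util.parsing.combinator.RegexParsers

sealed trait Node
case class Operator(s: String)
case class Word(s: String) extends Node
case class Sentence(a: Node, op: Operator, b: Node) extends Node

// Hypothetical object name, chosen for this sketch only.
object LeftNestedParser extends RegexParsers {
  def logicalOperator: Parser[Operator] = """&&|\|\|""".r ^^ (op => Operator(op))

  // Assumption: words are lowercase letters only.
  def word: Parser[Word] = """[a-z]+""".r ^^ (w => Word(w))

  // rep returns every (operator ~ word) pair, so the operators are not lost
  // the way they are with repsep.
  def sentence: Parser[Node] = word ~ rep(logicalOperator ~ word) ^^ {
    case first ~ rest =>
      // Fold from the left: each step wraps the tree built so far.
      rest.foldLeft[Node](first) { case (acc, op ~ w) => Sentence(acc, op, w) }
  }
}
```

With this variation, once&&upon||time would become Sentence(Sentence(Word(once), Operator(&&), Word(upon)), Operator(||), Word(time)), and a single word is still returned as a plain Word because rep yields an empty list. Neither version gives && higher precedence than ||; that would need a separate grammar level per operator.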
