
Parsing expressions inside arithmetic expressions

I would like to parse arithmetic expressions.

Here is my current parser:

data AExpr
    = ExprAsAExpr Expr
    | IntConst Integer
    | Neg AExpr
    | ABinary ABinOp AExpr AExpr
    deriving (Show, Eq)

aExpr :: Parser AExpr
aExpr = makeExprParser aTerm aOperators

aTerm :: Parser AExpr
aTerm
    =   parens aExpr
    <|> IntConst <$> integerParser

aOperators :: [[Operator Parser AExpr]]
aOperators =
    [ [Prefix (Neg <$ symbol "-") ]
    , [ InfixL (ABinary Multiply <$ symbol "*")
      , InfixL (ABinary Divide   <$ symbol "/") ]
    , [ InfixL (ABinary Add      <$ symbol "+")
      , InfixL (ABinary Subtract <$ symbol "-") ]
    ]

Using this, I can correctly parse the following:

1 + 2

This generates an AST like this:

(ABinary Add (IntConst 1) (IntConst 2))

I can also parse general expressions, such as variables, method calls, ternaries, etc.

For example:

Identifiers:

varName

This generates:

(Identifier (Name "varName"))

Method calls:

methodCall()

This generates:

(MethodCall (Name "methodCall") (BlockExpr []))

Here's an example parser for general expressions:

expressionParser :: Parser Expr
expressionParser
    =   methodCallParser
    <|> identifierParser

This works fine, but I would also like to parse arithmetic expressions here:

expressionParser :: Parser Expr
expressionParser
    =   newClassInstanceParser
    <|> methodCallParser
    <|> AExprAsExpr <$> aExpr
    <|> identifierParser

This means that with expressionParser I can now parse all the different expressions, including arithmetic expressions. If the input happens to be an arithmetic expression, it gets wrapped in AExprAsExpr.

Problem

I would like to parse arithmetic expressions containing other expressions.

For example:

x + y

To do this, my original thought was to change the arithmetic parser so that it also parses expressions:

data AExpr
    = ExprAsAExpr Expr
    | IntConst Integer
    | Neg AExpr
    | ABinary ABinOp AExpr AExpr
    deriving (Show, Eq)

aExpr :: Parser AExpr
aExpr = makeExprParser aTerm aOperators

aTerm :: Parser AExpr
aTerm
    =   parens aExpr
    <|> IntConst <$> integerParser
    <|> ExprAsAExpr <$> expressionParser

aOperators :: [[Operator Parser AExpr]]
aOperators =
    [ [Prefix (Neg <$ symbol "-") ]
    , [ InfixL (ABinary Multiply <$ symbol "*")
      , InfixL (ABinary Divide   <$ symbol "/") ]
    , [ InfixL (ABinary Add      <$ symbol "+")
      , InfixL (ABinary Subtract <$ symbol "-") ]
    ]

The problem with this is that there is a recursive loop: aTerm calls the expression parser, and the expression parser calls aExpr. This leads to an infinite loop. There is also the issue that all identifiers will now be wrapped inside an AExprAsExpr.

What is the correct method of parsing expressions inside arithmetic expressions?

EDIT: I just now realized that you are using makeExprParser, and my answer doesn't really apply to that. Anyway, maybe this answer is still helpful?

Parsec builds recursive-descent parsers, which means it cannot handle left recursion, as you are seeing. You need to factor the left recursion out, which can always be done if the grammar is context-free. One common way to do this factoring is to have a production for each precedence level. Here is an example grammar for simple arithmetic:

expr ::= addExpr
addExpr ::= mulExpr '+' addExpr
          | mulExpr '-' addExpr
          | mulExpr
mulExpr ::= term '*' mulExpr
          | term '/' mulExpr
          | term
term ::= '(' expr ')'
       | number

Notice the pattern: the first symbol in each production calls down to the next, more specific production. Explicit parentheses then allow us to re-enter the top-level production. This is generally how operator precedence is expressed in recursive descent.

The above grammar can only produce right-nested expressions. While it accepts exactly the right strings, it does not parse them correctly when the operators are meant to be left-associative. In particular,
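As a rough illustration, the grammar can be transcribed into Parsec almost production by production. This is only a sketch: the Expr type, the lexeme and symbol helpers, and parseExpr are names invented here for the example, not taken from the question. Each production is left-factored (parse the common first symbol once, then look for an operator) so that no backtracking is needed.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical AST for this sketch only.
data Expr
    = Num Integer
    | Add Expr Expr
    | Sub Expr Expr
    | Mul Expr Expr
    | Div Expr Expr
    deriving (Show, Eq)

-- Run a parser, then skip trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

symbol :: String -> Parser String
symbol = lexeme . string

-- expr ::= addExpr
expr :: Parser Expr
expr = addExpr

-- addExpr ::= mulExpr '+' addExpr | mulExpr '-' addExpr | mulExpr
addExpr :: Parser Expr
addExpr = do
    lhs <- mulExpr
    (Add lhs <$> (symbol "+" *> addExpr))
        <|> (Sub lhs <$> (symbol "-" *> addExpr))
        <|> pure lhs

-- mulExpr ::= term '*' mulExpr | term '/' mulExpr | term
mulExpr :: Parser Expr
mulExpr = do
    lhs <- term
    (Mul lhs <$> (symbol "*" *> mulExpr))
        <|> (Div lhs <$> (symbol "/" *> mulExpr))
        <|> pure lhs

-- term ::= '(' expr ')' | number
term :: Parser Expr
term = between (symbol "(") (symbol ")") expr
    <|> Num . read <$> lexeme (many1 digit)

parseExpr :: String -> Either ParseError Expr
parseExpr = parse (spaces *> expr <* eof) ""
```

With these definitions, parseExpr "1 + 2 * 3" should give Right (Add (Num 1) (Mul (Num 2) (Num 3))), because the multiplication is parsed one level further down than the addition.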

1 - 2 - 3 - 4

will be parsed as

1 - (2 - (3 - 4))

which is not correct according to the usual conventions. In a general recursive-descent parser you have to do some tricks to associate correctly here. In Parsec, however, we have the many combinators, which we can use to our advantage. For example, to parse left-associative subtraction, we could say

-- Parse one or more operands separated by "-", then fold from the left.
subExpr = foldl1 (-) <$> (mulExpr `sepBy1` symbol "-")
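A self-contained version of this idea might look like the following sketch. It evaluates directly to an Integer and uses plain numbers as operands to stay short. The "-" separators are consumed explicitly with sepBy1. The lexeme, symbol, number and evalSub names are assumptions made for this example.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Run a parser, then skip trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

symbol :: String -> Parser String
symbol = lexeme . string

number :: Parser Integer
number = lexeme (read <$> many1 digit)

-- Collect all operands first, then combine them from the left,
-- so "a - b - c" becomes (a - b) - c.
subExpr :: Parser Integer
subExpr = foldl1 (-) <$> (number `sepBy1` symbol "-")

evalSub :: String -> Either ParseError Integer
evalSub = parse (spaces *> subExpr <* eof) ""
```

Here evalSub "1 - 2 - 3 - 4" gives Right (-8), which is ((1 - 2) - 3) - 4. The right-nested reading 1 - (2 - (3 - 4)) would give -2 instead.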

The next step up is the chainl combinators, which seem to have been designed for just this purpose (though I only just learned about them; I guess I should have perused the docs more). An example of using them would be

addExpr = chainl1 mulExpr oper
    where
    oper = choice [ (+) <$ symbol "+"
                  , (-) <$ symbol "-"
                  ]
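To connect this back to the question, here is a sketch of chainl1 building an AST in which identifiers and method calls appear as operands. The types and parser names below are simplified, hypothetical stand-ins for the question's own (method calls take no arguments here, for brevity). The point is that atom only re-enters expr after consuming an opening parenthesis, so the recursion is never left recursion.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical, simplified stand-ins for the question's types.
data Op = Add | Subtract | Multiply | Divide
    deriving (Show, Eq)

data Expr
    = IntConst Integer
    | Identifier String
    | MethodCall String
    | Binary Op Expr Expr
    deriving (Show, Eq)

-- Run a parser, then skip trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

symbol :: String -> Parser String
symbol = lexeme . string

name :: Parser String
name = lexeme ((:) <$> letter <*> many alphaNum)

-- Operands: none of these alternatives starts by calling expr.
atom :: Parser Expr
atom = between (symbol "(") (symbol ")") expr
    <|> IntConst . read <$> lexeme (many1 digit)
    <|> try (MethodCall <$> name <* symbol "(" <* symbol ")")
    <|> Identifier <$> name

-- One chainl1 layer per precedence level; chainl1 associates to the left.
mulExpr :: Parser Expr
mulExpr = chainl1 atom
    (Binary Multiply <$ symbol "*" <|> Binary Divide <$ symbol "/")

addExpr :: Parser Expr
addExpr = chainl1 mulExpr
    (Binary Add <$ symbol "+" <|> Binary Subtract <$ symbol "-")

expr :: Parser Expr
expr = addExpr

parseExpr :: String -> Either ParseError Expr
parseExpr = parse (spaces *> expr <* eof) ""
```

With this sketch, parseExpr "x + y" gives Right (Binary Add (Identifier "x") (Identifier "y")). In the question's terms, this suggests having aTerm call the operand-level parsers (method call, identifier) directly rather than the full expressionParser, so that aExpr is only re-entered through parentheses.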

I love writing parsers in Haskell. Good luck!
