
How can I create an abstract syntax tree considering '|'? (Ply / Yacc)

Consider the following grammar:

expr : expr '+' term | expr '-' term | term
term : term '*' factor | term '/' factor | factor
factor : '(' expr ')' | identifier | number

This is my code using ply:

from ply import lex, yacc

tokens = [
    "identifier",
    "number",
    "plus",
    "minus",
    "mult",
    "div"
]

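literals = "()"  # lets the lexer emit the '(' and ')' used in the factor rule
t_ignore = " \t"  # plain characters to skip, not a regex (a raw string would also skip "\" and "t")
t_identifier = r"[a-zA-Z]+"  # no ^/$ anchors: tokens are matched in the middle of the input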
t_number = r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?"
t_plus = r"\+"
t_minus = r"-"
t_mult = r"\*"
t_div = r"/"

def p_stmt(p):
    """stmt : expr"""
    p[0] = ("stmt", p[1])

def p_expr(p):
    """expr : expr plus term 
            | expr minus term 
            | term"""
    p[0] = ("expr", p[1], p[2]) # Problem here <<<

def p_term(p):
    """term : term mult factor 
            | term div factor 
            | factor"""

def p_factor(p):
    """factor : '(' expr ')' 
              | identifier 
              | number"""


if __name__ == "__main__":
    lex.lex()
    yacc.yacc()
    data = "32 + 10"
    result = yacc.parse(data)
    print(result)

How am I supposed to build an AST for the expression if I can't access the operators? I could split the rule into separate functions such as p_expr_plus, but I think that would eliminate operator precedence. The docs are not much help, since I'm a beginner and can't work this out from them. The best material I've found on the subject is this PDF, but it does not address operator precedence.

EDIT: I can't access p[2] or p[3], since I get an IndexError (the input is matching only the term alternative). In the PDF I linked, they explicitly put the operator inside the tuple, like ('+', p[1], p[2]), which is exactly my problem with precedence: I can't split the rule into separate functions, because an expression is an expression, so there should be a way to keep the pipes and still access whichever operator matched.

As far as I can see, in p[0] = ("expr", p[1], p[2]), p[1] is the left-hand expression, p[2] is the operator, and p[3] (which you aren't using) is the right-hand term.

Use p[2] to determine the operator and add p[3], since you will need it, and you should be good to go.

Also, you must check how many items p has: if the last alternative, | term, is matched, p will have only two items instead of four.
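As a minimal sketch of that length check applied to the grammar from the question: the helper below, build_binary_node, is a hypothetical name and not part of PLY. It relies only on len(p) and indexing, so it behaves the same on a plain list as on the p object that PLY passes to a rule function.

```python
def build_binary_node(p):
    """Return an AST node for a rule such as `expr : expr plus term | term`.

    `p` is anything that supports len() and indexing the way PLY's
    production object does: p[1], p[2], ... hold the matched symbols.
    """
    if len(p) == 4:
        # Binary alternative: p[1] = left operand, p[2] = operator, p[3] = right operand.
        return (p[2], p[1], p[3])
    # Single-symbol alternative (`| term`): pass the child node through unchanged.
    return p[1]


# Plain lists standing in for what PLY would pass; index 0 is the result slot.
print(build_binary_node([None, "32", "+", "10"]))  # ('+', '32', '10')
print(build_binary_node([None, "32"]))             # 32
```

Inside the parser this would be called as p[0] = build_binary_node(p) in both p_expr and p_term. Operator precedence is unaffected either way, because it comes from the expr / term / factor layering of the grammar and not from how the rule functions are organized.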

Take a look at a snippet from the GardenSnake example:

def p_comparison(p):
    """comparison : comparison PLUS comparison
                  | comparison MINUS comparison
                  | comparison MULT comparison
                  | comparison DIV comparison
                  | comparison LT comparison
                  | comparison EQ comparison
                  | comparison GT comparison
                  | PLUS comparison
                  | MINUS comparison
                  | power"""
    if len(p) == 4:
        p[0] = binary_ops[p[2]]((p[1], p[3]))
    elif len(p) == 3:
        p[0] = unary_ops[p[1]](p[2])
    else:
        p[0] = p[1]
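The snippet uses binary_ops and unary_ops without showing them. They are presumably dictionaries keyed by the operator text, each mapping to something that builds a node. A hypothetical sketch with plain tuples as nodes follows; make_binary, make_unary, reduce_comparison, and the tuple layout are assumptions made for illustration, not the definitions from the GardenSnake example.

```python
def make_binary(op):
    # Builds a constructor that takes a (left, right) pair, matching the
    # call shape binary_ops[p[2]]((p[1], p[3])) in the snippet.
    def build(operands):
        left, right = operands
        return (op, left, right)
    return build


def make_unary(op):
    # Builds a constructor that takes a single operand.
    def build(operand):
        return (op, operand)
    return build


# Assumed operator spellings; they must equal the token values the lexer produces.
binary_ops = {op: make_binary(op) for op in ("+", "-", "*", "/", "<", "==", ">")}
unary_ops = {op: make_unary(op) for op in ("+", "-")}


def reduce_comparison(p):
    # Same dispatch as p_comparison, but returning the node instead of setting p[0].
    if len(p) == 4:
        return binary_ops[p[2]]((p[1], p[3]))
    elif len(p) == 3:
        return unary_ops[p[1]](p[2])
    return p[1]


print(reduce_comparison([None, "a", "<", "b"]))  # ('<', 'a', 'b')
print(reduce_comparison([None, "-", "a"]))       # ('-', 'a')
```

Keeping the operator-to-node mapping in a table means a new operator needs one grammar alternative and one dictionary entry, with no change to the body of the rule function.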
