
Python PLY Yacc "syntax error"

Okay, so I'm trying to build a parser for my mini-language (obviously), and setting variables seems to be working properly. But as soon as Yacc comes across a function definition, it just gives me a syntax error and a couple of EOF errors (which I know come from Yacc having no remaining rules to match), and nothing else happens. Where did I go wrong?

Here's an example of the syntax I'm parsing:

$name = "John Doe"
$age = 72
$waterInOceans = 95.4

!testFunction {

}

Where the !testFunction { } section defines a function (based on the exclamation point). I don't know if that's going to be useful in debugging.

# The Lexer

import ply.lex as lex

tokens = ["MINUS", "SEPARATOR", "MODIFIER", "FUNCTION_NAME", "UNDEF_BLOCK", "VARIABLE_NAME", "EQUALS", "STRING", "FLOAT", "INNER_CONTENT", "ARGUMENTS", "INTEGER", "PLUS"]

def t_ARGUMENTS(t): # Finds arguments in calls and function definitions
    r'\(.*\)'
    t.value = t.value[1:-1] # strip parenthesis
    t.value = t.value.split(" && ")
    return t

def t_STRING(t): # finds strings
    r'"\w.+"'
    t.value = t.value[1:-1] # strips the quotation marks of the string
    return t

def t_FLOAT(t): # finds floats
    r'\d+\.\d+'
    t.value = float(t.value)
    return t

def t_INTEGER(t):
    r'\d+'
    t.value = int(t.value)
    return t

def t_VARIABLE_NAME(t):
    r'\$\w*\b'
    t.value = t.value[1:]
    return t

def t_INNER_CONTENT(t):
    r'\{\n.*\n\}|\{.*\}'
    t.value = t.value[1:-1]
    return t

def t_FUNCTION_NAME(t):
    r'!\w+'
    t.value = t.value[1:]
    return t

t_ignore = " \t\r\n" # t_ignore is a string of characters to skip, not a regex
t_EQUALS = r"\="
t_PLUS = r"\+"
t_MINUS = r"-"
t_MODIFIER = r"\."
t_SEPARATOR = r"\,"

t_UNDEF_BLOCK = r"\w+" # Any block of text that is left over and isn't assigned by the end (used by functions)

def t_error(t):
    t.lexer.skip(1)

lex.lex()

#opened = open("example.zeq", "r")
#content = opened.read()
#opened.close()

#lex.input(content)

And then the Yacc half:

# The Yacc parser

import ply.yacc as yacc
import compiler # Get the compiler (tokenizer; compiler.py) which generates tokens
import sys
from os import system


##############
### IGNORE ###
tokens = compiler.tokens
#system("clear")
print("Executing "+sys.argv[1]+" |\n"+("-"*(len(sys.argv[1])+12)))
### IGNORE ###
##############


VARIABLES = {}
FUNCTIONS = {}

def p_assign(p): # Set new variable
    '''assignment : VARIABLE_NAME EQUALS compound
                  | VARIABLE_NAME EQUALS STRING
                  | VARIABLE_NAME EQUALS INTEGER
                  | VARIABLE_NAME EQUALS FLOAT'''

    #print("Setting '{}' to '{}'...".format(str(p[1]), str(p[3])))
    VARIABLES[p[1]] = p[3]

def p_number(p): # Combines floats and integers into a blanket non-terminal for simplicity sakes
    '''number : FLOAT
              | INTEGER'''
    p[0] = p[1]

def p_compound(p): # Complete the value *before* the variable is assigned!
    '''compound : number PLUS number
                | number MINUS number'''

    type1 = type(p[1])
    type2 = type(p[3])
    operator = p[2]
    if operator == "+":
        p[0] = p[1] + p[3]
    elif operator == "-":
        p[0] = p[1] - p[3]

def p_undefined(p):
    '''undefined : UNDEF_BLOCK'''
    print("Undefined block")

def p_function(p):
    '''function : FUNCTION_NAME INNER_CONTENT'''

    print("Creating a function")

    name = p[1]
    content = p[2]

    FUNCTIONS[name] = content

def p_empty(p):
    '''empty : '''

#~ def p_error(p):
    #~ if p:
        #~ print("Syntax error: "+p.type)
    #~ else:
        #~ pass

parser = yacc.yacc()

opened = open(sys.argv[1], "r")
content = opened.read()
opened.close()

for line in content.splitlines():
    parser.parse(line)

print(VARIABLES)
print(FUNCTIONS)

I'm expecting it to be a simple overlooked detail...

When you ask Ply (or yacc, for that matter) to parse an input, it attempts to recognize a single instance of the top-level non-terminal (or "starting symbol"). This will usually be a grammatical description of the entire input, so it will often have a name like program, although there are use cases in which it is useful to parse just a part of the input.

Ply (and yacc) assume that the first grammar production is for the starting symbol. In your case, the first production is assignment, and so that is what it will try to parse (and nothing else). assignment cannot derive a function definition or any other statement type, so those cause syntax errors.
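One way to act on this is to make the first production a rule that covers the whole input. The following is only a minimal sketch, assuming PLY 3.x: the non-terminals program and statement are hypothetical names not present in your grammar, the lexer is cut down to a few tokens, and INNER_CONTENT uses a simplified pattern. It also parses the whole text in one call, since feeding the parser one line at a time means a multi-line { } block never reaches the lexer as a single token.

```python
import ply.lex as lex
import ply.yacc as yacc

# Reduced token set, just enough for the illustration.
tokens = ["VARIABLE_NAME", "EQUALS", "INTEGER", "FUNCTION_NAME", "INNER_CONTENT"]

t_EQUALS = r"="
t_ignore = " \t\r\n"  # characters skipped between tokens

def t_INTEGER(t):
    r"\d+"
    t.value = int(t.value)
    return t

def t_VARIABLE_NAME(t):
    r"\$\w+"
    t.value = t.value[1:]  # drop the leading $
    return t

def t_FUNCTION_NAME(t):
    r"!\w+"
    t.value = t.value[1:]  # drop the leading !
    return t

def t_INNER_CONTENT(t):
    r"\{[^}]*\}"
    # Simplified pattern (an assumption): everything up to the first closing brace.
    t.value = t.value[1:-1].strip()
    return t

def t_error(t):
    t.lexer.skip(1)

VARIABLES = {}
FUNCTIONS = {}

# First production in the file, so PLY treats `program` as the starting symbol.
def p_program(p):
    """program : program statement
               | statement"""

# Hypothetical non-terminal listing every statement type the input may contain.
def p_statement(p):
    """statement : assignment
                 | function"""

def p_assignment(p):
    """assignment : VARIABLE_NAME EQUALS INTEGER"""
    VARIABLES[p[1]] = p[3]

def p_function(p):
    """function : FUNCTION_NAME INNER_CONTENT"""
    FUNCTIONS[p[1]] = p[2]

def p_error(p):
    print("Syntax error at", p.type if p else "end of input")

lexer = lex.lex()
parser = yacc.yacc(debug=False, write_tables=False)  # PLY 3.x keyword arguments

# Parse the whole input at once rather than line by line.
parser.parse("$age = 72\n!testFunction {\n\n}\n", lexer=lexer)
print(VARIABLES)
print(FUNCTIONS)
```

With program first, both the assignment and the function definition are reachable from the starting symbol, so neither should be reported as a syntax error.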

If you want to explicitly tell Ply what the top-level symbol is, you can do so. See the manual section on starting symbols.
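As a rough illustration of the explicit form, the sketch below keeps assignment as the first production but declares start in the grammar module, which is the mechanism the PLY manual describes. It assumes PLY 3.x; the program rule and the RESULT dictionary are hypothetical names introduced only for this example.

```python
import ply.lex as lex
import ply.yacc as yacc

tokens = ["VARIABLE_NAME", "EQUALS", "INTEGER"]

t_EQUALS = r"="
t_ignore = " \t\r\n"  # characters skipped between tokens

def t_INTEGER(t):
    r"\d+"
    t.value = int(t.value)
    return t

def t_VARIABLE_NAME(t):
    r"\$\w+"
    t.value = t.value[1:]  # drop the leading $
    return t

def t_error(t):
    t.lexer.skip(1)

RESULT = {}  # hypothetical store for parsed assignments

# assignment is still the first production in the file...
def p_assignment(p):
    """assignment : VARIABLE_NAME EQUALS INTEGER"""
    RESULT[p[1]] = p[3]

# ...and the rule covering the whole input comes later.
def p_program(p):
    """program : program assignment
               | assignment"""

def p_error(p):
    print("Syntax error at", p.type if p else "end of input")

# Explicit starting symbol: overrides the "first production" default.
start = "program"

lexer = lex.lex()
parser = yacc.yacc(debug=False, write_tables=False)  # PLY 3.x keyword arguments

parser.parse("$a = 1 $b = 2", lexer=lexer)
print(RESULT)
```

The same choice can be passed as a keyword argument instead, as in yacc.yacc(start="program"), which may be convenient when one grammar module is used to build parsers for different sub-languages.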
