

Bison Grammar breaks down on repeated tokens / expressions?

With a pretty basic Bison / Flex grammar, I'm trying to pull tokens / expressions into C++ objects to generate three op codes from (i.e. an internal representation). I'm doing this because this particular parser represents a smaller subset of a larger parser. My problem comes with repeated expressions / tokens.

For example:

10 + 55 will parse as 10 + 10.

10 + VARIABLENAME will parse fine, as INT and VARIABLE are different tokens.

55-HELLOWORLD / 100 will again parse fine, presumably because there are never two of the same token on either side of the expression.

55-HELLOWORLD - 100 segfaults. Repeating operation tokens (i.e. -, +, /, etc.) causes the parser to crash.

TLDR: When repeating value types (i.e. INT, FLOAT, VARIABLE), the same token is returned twice. When repeating operations, the parser segfaults.

My presumption is that the problem is something I'm doing when loading the $1/$3 values into class objects and then adding them to the parser stack. I've tried checking the memory addresses of each variable and pointer I generate, and they all appear to be as I'd expect (i.e. I'm not overwriting the same object). I've tried ensuring values are loaded properly as their value tokens; the INT and VARIABLE rules both load their respective values properly into classes.

The issue seems to be pinpointed to the expression OPERATION expression statements: when using two values of the same type, the two expressions are identical. To use an earlier example:

10 + 55 -> expression PLUS expression -> $1 = 10, $3 = 10

When the variables are loaded as INT, both are as expected?

Here's my parser.y, as well as the objects I'm trying to load values into.

%{
  #include <cstdio>
  #include <iostream>
  #include "TOC/Operation.h"
  #include "TOC/Value.h"
  #include "TOC/Variable.h"
  #include "TOC.h"

  using namespace std;

  extern int yylex();
  extern int yyparse();
  extern FILE *yyin;

  void yyerror(const char *s);
%}

%code requires {
    // This is required to force bison to include TOC before the preprocessing of union types and YYTYPE.
    #include "TOC.h"
}

%union {
  int ival;
  float fval;
  char *vval;
  TOC * toc_T;
}

%token <ival> INT
%token <fval> FLOAT
%token <vval> VARIABLE

%token ENDL PLUS MINUS MUL DIV LPAREN RPAREN

%type <toc_T> expression1
%type <toc_T> expression

%right PLUS MINUS
%right MUL DIV

%start start
%%
start:
        expressions;
expressions:
    expressions expression1 ENDL
    | expression1 ENDL;
expression1:
    expression { 
        TOC* x = $1;
        cout<<x->toTOCStr()<<endl; 
    }; 
expression: 
    expression PLUS expression { 
        TOC *a1 = $1;
        TOC *a2 = $3;
        Operation op(a1, a2, OPS::ADD);
        TOC *t = &op;
        $$ = t;
    }
    |expression MINUS expression { 
        TOC *a1 = $1;
        TOC *a2 = $3;
        Operation op(a1, a2, OPS::SUBTRACT);
        TOC *t = &op;
        $$ = t;    
    }
    |expression MUL expression {
        TOC *a1 = $1;
        TOC *a2 = $3;
        Operation op(a1, a2, OPS::MULTIPLY);
        TOC *t = &op;
        $$ = t;
    }
    |expression DIV expression { 
        TOC *a1 = $1;
        TOC *a2 = $3;
        Operation op(a1, a2, OPS::DIVIDE);
        TOC *t = &op;
        $$ = t;
    }
    |LPAREN expression RPAREN { 
        TOC *t = $2; 
        $$ =  t;
    }
    | INT { 
        Value<int> v = $1;
        TOC *t = &v; 
        $$ =  t;
    }
    | FLOAT { 
        Value<float> v = $1;
        TOC *t = &v;
        $$ = t; 
    }
    | VARIABLE {
        char* name = $1;
        Variable v(name);
        TOC *t = &v;
        $$ = t;
    }
%%

void yyerror(const char *s) {
  cout << "Parser Error:  Message: " << s << endl;
  exit(-1);
}

And the values I'm trying to load (concatenated as one file, for some clarity).

Operation.h

enum OPS {
    SUBTRACT,
    ADD,
    MULTIPLY,
    DIVIDE,
    EXPONENT
};

class Operation : public TOC{

    OPS op;
    public:
        TOC* arg1;
        TOC* arg2;
        Operation(TOC* arg1_in, TOC* arg2_in, OPS operation){
            tt = TOC_TYPES::OPERATION_E;
            arg1 = arg1_in;
            arg2 = arg2_in;
            op = operation;
        };


        std::string toOPType(OPS e){
            switch (e){
                case SUBTRACT:
                    return "-";
                case ADD:
                    return "+";
                case MULTIPLY:
                    return "*";
                case DIVIDE:
                    return "/";
                case EXPONENT:
                    return "^";
                default:
                    return "[Operation Error!]";
            }
        }

        std::string toTOCStr(){
            return arg1->toTOCStr() + toOPType(op) + arg2->toTOCStr();
        }
};

Value.h

template <class T> class Value : public TOC {
    public:
        T argument;
        Value(T arg){
            tt = TOC_TYPES::VALUE_E;
            argument = arg;
        }

        std::string toTOCStr(){
            std::string x = std::to_string(argument);
            return x;
        }
};

Variable.h

class Variable : public TOC {
    public:
        char *name;
        Variable(char* name_in){
            tt = TOC_TYPES::VARIABLE_E;
            name = name_in;
        }
        std::string toTOCStr(){
            std::string x = name;
            return x;
        }
};

TOC.h, in case this is needed

enum TOC_TYPES { 
    VARIABLE_E, 
    VALUE_E,
    OPERATION_E
};

class TOC{
    public:
        TOC_TYPES tt;   
        virtual std::string toTOCStr() = 0;
};

My main file simply opens a file and points yyin at it before calling yyparse. I haven't included it, but can if need be (it's not very exciting).

Ideally, I'd like to load my entire RD parse tree into a TOC*, which I can then iterate down through to generate three op code at each level. This error breaking repeated tokens and operations is really stumping me, however.

Here's an example of the problem:

    Operation op(a1, a2, OPS::ADD);
    TOC *t = &op;
    $$ = t;

(t is unnecessary; you could just as well have written $$ = &op; instead. But that's just a side note.)

op here is an automatic variable, whose lifetime ends when the block is exited. And that happens immediately after its address is saved in $$. That makes the semantic value of the production a dangling pointer.

Using the address of a variable whose lifetime has ended is Undefined Behaviour, but you can probably guess what is happening: the next time the block is entered, the stack is at the same place and the new op has the same address as the old one. (There's no guarantee that that will happen: undefined behaviour is undefined by definition. But this particular result is consistent with your observation.)

In short, get cosy with the new operator:

$$ = new Operation(a1, a2, OPS::ADD);
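The same pattern appears in the INT, FLOAT and VARIABLE actions of the question (for example Value<int> v = $1; followed by $$ = &v), which would explain 10 + 55 coming out as 10 + 10, so those actions presumably need the same change. As a rough sketch of what heap allocation buys you, the following simulates the reductions for 10 + 55 outside of Bison. The classes are trimmed stand-ins for the question's TOC hierarchy, and IntValue, Add, reduce_int, reduce_plus and parse_ten_plus_fifty_five are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <string>

// Simplified stand-ins for the question's classes (assumed, trimmed for the sketch).
class TOC {
public:
    virtual ~TOC() = default;
    virtual std::string toTOCStr() const = 0;
};

class IntValue : public TOC {
    int argument;
public:
    explicit IntValue(int arg) : argument(arg) {}
    std::string toTOCStr() const override { return std::to_string(argument); }
};

class Add : public TOC {
    TOC* arg1;
    TOC* arg2;
public:
    Add(TOC* lhs, TOC* rhs) : arg1(lhs), arg2(rhs) { assert(lhs && rhs); }
    std::string toTOCStr() const override {
        return arg1->toTOCStr() + "+" + arg2->toTOCStr();
    }
};

// Roughly what the INT action would do:  $$ = new Value<int>($1);
TOC* reduce_int(int token_value) { return new IntValue(token_value); }

// Roughly what the PLUS action would do: $$ = new Operation($1, $3, OPS::ADD);
TOC* reduce_plus(TOC* lhs, TOC* rhs) { return new Add(lhs, rhs); }

// Simulates the three reductions performed for the input "10 + 55".
std::string parse_ten_plus_fifty_five() {
    TOC* left = reduce_int(10);
    TOC* right = reduce_int(55);          // a distinct heap object; `left` is not overwritten
    TOC* sum = reduce_plus(left, right);  // both operands are still alive here
    std::string text = sum->toTOCStr();
    delete sum;                           // Add does not own its operands in this sketch,
    delete right;                         // so each node is freed individually
    delete left;
    return text;
}
```

Each call to new yields an object that outlives the action block, so the pointer stored in $$ stays valid until it is explicitly deleted.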

And don't forget to delete it at an appropriate moment.
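One possible "appropriate moment" is right after the finished tree has been printed in the expression1 action, provided each operation node deletes its own operands and the base class has a virtual destructor. The sketch below assumes exactly that ownership scheme; it is not taken from the question's code. Leaf, BinaryOp, print_and_free and the live_nodes counter are hypothetical names, and the counter exists only so the sketch can show that nothing leaks.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical counter, used only to check that every node gets freed.
static int live_nodes = 0;

class TOC {
public:
    TOC() { ++live_nodes; }
    TOC(const TOC&) = delete;             // nodes are owned via raw pointers; forbid copies
    TOC& operator=(const TOC&) = delete;
    virtual ~TOC() { --live_nodes; }      // virtual, so deleting through a TOC* runs the derived destructor
    virtual std::string toTOCStr() const = 0;
};

// Stand-in for Value<T> and Variable: a node with no children.
class Leaf : public TOC {
    std::string text;
public:
    explicit Leaf(std::string t) : text(std::move(t)) {}
    std::string toTOCStr() const override { return text; }
};

// Stand-in for Operation: owns both operands and frees them on destruction.
class BinaryOp : public TOC {
    TOC* arg1;
    TOC* arg2;
    char symbol;
public:
    BinaryOp(TOC* lhs, TOC* rhs, char sym) : arg1(lhs), arg2(rhs), symbol(sym) {
        assert(lhs && rhs);
    }
    ~BinaryOp() override {
        delete arg1;
        delete arg2;
    }
    std::string toTOCStr() const override {
        return arg1->toTOCStr() + symbol + arg2->toTOCStr();
    }
};

// Roughly what the expression1 action could do: use the tree, then free it.
std::string print_and_free(TOC* root) {
    std::string text = root->toTOCStr();
    delete root;                          // recursively frees the whole tree
    return text;
}
```

With this scheme a single delete on the root releases the whole tree. Nodes discarded by Bison during error recovery are not covered by this sketch; Bison's %destructor directive is the usual tool for that case.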
