Would you please help me find the cause for my Lexical Analyzer Error: Invalid Symbol?

Okay, so the issue is that I don't know why I'm getting this error.

For a class, we're writing a compiler piece by piece. This bit of code is supposed to tokenize input symbols. I wrote a series of if/else statements that act like a very simple trie, thinking that it would be able to find all the symbols. It works fine for some of them, but it gets stuck on "<>".

Here's the whole function:

// Process the symbols
void symbol_processor(char *input)
{
    // Initialize symbol_type operator
    int symbol_type = -1;

    printf("Location1: %d\n", input_index);

    // A series of if/else that ape a trie
    if (input[input_index] == '=')
        if (input[input_index + 1] == '=')
        {
            printf("Location2: %d\n", input_index);
            // Set the symbol for "=="
            symbol_type = eqlsym;

            // Move forward two input_index spaces
            input_index += 2;
            printf("Location4: %d\n", input_index);
        }
    else if (input[input_index] == '<')
        if (input[input_index + 1] == '>')
        {
            printf("Location4: %d\n", input_index);
            // Set the symbol for "<>"
            symbol_type = neqsym;

            // Move forward two input_index spaces
            input_index += 2;
        }
        else if (input[input_index + 1] == '=')
        {
            // Set the symbol for "<="
            symbol_type = leqsym;

            // Move forward two input_index spaces
            input_index += 2;
        }
        else
        {
            printf("Location: %d\n", input_index);
            // Set the symbol for "<"
            symbol_type = lessym;

            // Move forward one input_index space
            input_index++;
        }
    else if (input[input_index] == '>')
        if (input[input_index + 1] == '=')
        {
            // Set the symbol for ">="
            symbol_type = geqsym;

            // Move forward two input_index spaces
            input_index += 2;
        }
        else
        {
            // Set the symbol for ">"
            symbol_type = modsym;

            // Move forward one index space
            input_index++;
        }
    else if (input[input_index] == ':')
        if (input[input_index + 1] == '=')
        {
            // Set the symbol for ":="
            symbol_type = becomessym;

            // Move forward two input_index spaces
            input_index += 2;
        }
    // This could cause an issue
    else if (input[input_index] == '/')
        if (input[input_index + 1] == '*')
            comment_error = comment_processor(input);
        else
        {
            
            // Set the symbol for ">"
            symbol_type = slashsym;

            // Move forward one index space
            input_index++;
        }
    else if (input[input_index] == '%')
    {
        // Set the symbol for "%"
        symbol_type = modsym;

        // Move forward one index space
        input_index++;
    }
    else if (input[input_index] == '*')
    {
        // Set the symbol for "*"
        symbol_type = multsym;

        // Move forward one index space
        input_index++;
    }
    else if (input[input_index] == '+')
    {
        // Set the symbol for "+"
        symbol_type = plussym;

        // Move forward one index space
        input_index++;
    }
    else if (input[input_index] == '-')
    {
        // Set the symbol for "-"
        symbol_type = minussym;

        // Move forward one index space
        input_index++;
    }
    else if (input[input_index] == '(')
    {
        // Set the symbol for "("
        symbol_type = lparentsym;

        // Move forward one index space
        input_index++;
    }
    else if (input[input_index] == ')')
    {
        // Set the symbol for ")"
        symbol_type = rparentsym;

        // Move forward one input_index space
        input_index++;
    }
    else if (input[input_index] == ',')
    {
        // Set the symbol for ","
        symbol_type = commasym;

        // Move forward one index space
        input_index++;
    }
    else if (input[input_index] == '.')
    {
        // Set the symbol for "."
        symbol_type = periodsym;

        // Move forward one input_index space
        input_index++;
    }
    else if (input[input_index] == ';')
    {
        // Set the symbol for ";"
        symbol_type = semicolonsym;

        // Move forward one index space
        input_index++;
    }

    // Check to see if an error should be thrown
    if (symbol_type == -1)
        error_processor(1); // Invalid Symbol

    // Append symbol to the list
    list[lex_index].type = symbol_type;
    lex_index++;
}

But I'm pretty sure the problem is in here:

else if (input[input_index] == '<')
        if (input[input_index + 1] == '>')
        {
            printf("Location4: %d\n", input_index);
            // Set the symbol for "<>"
            symbol_type = neqsym;

            // Move forward two input_index spaces
            input_index += 2;
        }
        else if (input[input_index + 1] == '=')
        {
            // Set the symbol for "<="
            symbol_type = leqsym;

            // Move forward two input_index spaces
            input_index += 2;
        }
        else
        {
            printf("Location: %d\n", input_index);
            // Set the symbol for "<"
            symbol_type = lessym;

            // Move forward one input_index space
            input_index++;
        }

I just can't see what the issue is, and was hoping that programmers wiser and more experienced than I am could point it out. Also, just ignore the printf statements. I was using those to try and help debug.

Here's the entire input text I'm feeding in, if it helps. The error gets thrown at the '<' that comes right after "var".

const==var<>procedureend<=if>=then.else;while(do)call:=read,write+124-jalapeno*/ comment //

You only accept symbols that start with '=', because there is no else matching your first if. Your indenting makes it look like there is, but in C an else always pairs with the nearest preceding if that does not already have one, so the whole else if chain ends up nested under the first if. If you run your program through indent, you will see where you went wrong. This sequence:

if (input[input_index] == '=')
    if (input[input_index + 1] == '=')
    {
        printf("Location2: %d\n", input_index);
        // Set the symbol for "=="
        symbol_type = eqlsym;

        // Move forward two input_index spaces
        input_index += 2;
        printf("Location4: %d\n", input_index);
    }
else if (input[input_index] == '<')
    if (input[input_index + 1] == '>')

is actually:

if (input[input_index] == '=')
    if (input[input_index + 1] == '=')
    {
        printf("Location2: %d\n", input_index);
        // Set the symbol for "=="
        symbol_type = eqlsym;

        // Move forward two input_index spaces
        input_index += 2;
        printf("Location4: %d\n", input_index);
    } else if (input[input_index] == '<')
        if (input[input_index + 1] == '>')
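To make the pairing concrete, here is a minimal sketch that puts the chain as the compiler parses it next to the chain as it was presumably intended. The enum values and both function names are made up for illustration; the question's own token definitions are not shown.

```c
#include <stddef.h>

/* Hypothetical token codes for this sketch; the question's enum is not shown. */
enum token { TOK_INVALID = -1, TOK_EQL, TOK_NEQ, TOK_LEQ, TOK_LES };

/* The chain as the compiler parses it: the else belongs to the inner if. */
enum token classify_as_parsed(const char *s, size_t i)
{
    enum token type = TOK_INVALID;
    if (s[i] == '=') {
        if (s[i + 1] == '=')
            type = TOK_EQL;
        else if (s[i] == '<')   /* unreachable: s[i] is already known to be '=' */
            type = TOK_LES;
    }
    return type;
}

/* The intended chain: braces close the inner if before the outer else. */
enum token classify_braced(const char *s, size_t i)
{
    enum token type = TOK_INVALID;
    if (s[i] == '=') {
        if (s[i + 1] == '=')
            type = TOK_EQL;
    } else if (s[i] == '<') {
        if (s[i + 1] == '>')
            type = TOK_NEQ;
        else if (s[i + 1] == '=')
            type = TOK_LEQ;
        else
            type = TOK_LES;
    }
    return type;
}
```

With the input "<>", the first version never enters the outer if and reports an invalid symbol, while the braced version recognizes the token. Putting braces around the body of every outer if in the original function should have the same effect.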

It is generally more idiomatic C to use a switch statement, particularly for the outermost condition, so the logic looks more like:

switch (input[input_index]) {
    case '=':
        if (input[input_index + 1] == '=')
        {
        }
        break;
    case '<':
        if (input[input_index + 1] == '>')
        {
        }
        else if (input[input_index + 1] == '=')
        {
        }
        else
        {
        }
        break;
    case '>':
        if (input[input_index + 1] == '=')
        {
        }
        else
        {
        }
        break;
    /* ...the remaining cases follow the same pattern... */
}

It is quite a bit easier on the reader's eyes than trying to figure out whether every one of those if (input[input_index] == ... tests is using the same array and index, etc.
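As a rough sketch of where that could lead, the function below fills in the switch for a handful of the symbols. It is not a drop-in replacement: it takes the index through a pointer instead of using the global input_index, the enum values are assumptions, and invalidsym, gtrsym and next_symbol are hypothetical names (the question assigns modsym for ">", which looks like a copy-paste slip).

```c
#include <stddef.h>

/* Token names mirror the question where possible; the numeric values,
 * invalidsym and gtrsym are assumptions made for this sketch. */
enum symbol {
    invalidsym = -1, eqlsym, neqsym, leqsym, lessym, geqsym, gtrsym,
    becomessym, plussym, minussym
};

/* Classify the symbol at input[*index] and advance *index past it.
 * Returns invalidsym and leaves *index unchanged if nothing matches. */
enum symbol next_symbol(const char *input, size_t *index)
{
    /* Look ahead one character, but never read past the terminator. */
    char next = input[*index] ? input[*index + 1] : '\0';
    enum symbol type = invalidsym;
    size_t length = 1;

    switch (input[*index]) {
    case '=':
        if (next == '=') { type = eqlsym; length = 2; }
        break;
    case '<':
        if (next == '>') { type = neqsym; length = 2; }
        else if (next == '=') { type = leqsym; length = 2; }
        else type = lessym;
        break;
    case '>':
        if (next == '=') { type = geqsym; length = 2; }
        else type = gtrsym;
        break;
    case ':':
        if (next == '=') { type = becomessym; length = 2; }
        break;
    case '+': type = plussym; break;
    case '-': type = minussym; break;
    default:
        break;
    }

    if (type != invalidsym)
        *index += length;
    return type;
}
```

Because each case ends in break, a branch can no longer attach itself to the wrong if, and recording the token length once keeps the index update in a single place.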
