
Top Down Parsing - Java

I have been assigned a project to implement a top-down backtracking parser for any grammar that contains only one nonterminal on the RHS of its rewrite rules (e.g. S -> aaSb | aaSa | aSa).

So far, I have three methods, including main, that handle checking the validity of an input string.

My goal is to use a char[][] array for the grammar, check each character in the input string against the grammar, and return true if the string is in the language the grammar generates.

import java.util.Scanner;

public class TDBP {
    public static void main(String[] args) {
        char[][] g = new char[][] 
            { {'a', 'a', 'S', 'b'}, 
              {'a', 'a', 'S', 'a'}, 
              {'a', 'S', 'a'}, 
              {'\0'} };

        SP(g);
    }
    public static void SP(char[][] g) {
        Scanner s = new Scanner(System.in);
        boolean again = true; int pn = 0;
        String test;

        while(again) {
            System.out.print("Next string? ");
            test = s.nextLine();

            if(S(pn, test, g))
                System.out.println("String is in the language");
            else
                System.out.println("String is not in the language");

            if(s.nextLine().isEmpty()) again = false; // a blank line ends the loop
        }

        s.close();
    }
    public static boolean S(int pn, String test, char[][] g) {
        char[] c = test.toCharArray();
        boolean exists = false;

        for(int i = pn; i < g.length; i++) {
            for(int j = 0; j < g[i].length; j++) {
                if(c[j] == 'S')
                    S(++pn, test, g);
                if(c[j] == g[i][j])
                    exists = true;
            }
        }

        return exists;
    }
}

In my algorithm, pn is an integer that keeps track of which production in the grammar I am currently looking at, and makes sure that I don't scan the same production twice (e.g. a pn of 1 in the grammar above corresponds to aaSa). Also, I use \0 to represent the empty string.

Am I parsing the string correctly?

Thank you!

This can be solved more easily using recursive-descent top-down parsing.

Suppose your grammar is

S -> aaSb | aaSa | aSa | #

where # represents the empty string.

After left factoring, it becomes

S  -> aS' | #
S' -> aSS" | Sa
S" -> b | a

Then define each rule as a separate method and call them recursively, as shown below.

For the first rule: S -> aS' | #

function S(char[] input, int &i) {
    if(input[i]=='a') {
        i++;
        return S1(input, i);
    }
    return true;   // For the empty string
}

For the second rule: S' -> aSS" | Sa

function S1(char[] input, int &i) {
    if(input[i]=='a') {
        i++;
        if(!S(input, i)) return false;   // do not ignore the result of S
        return S2(input, i);
    } else {
        if(!S(input, i)) return false;
        if(input[i]=='a') {
            i++;   // consume the trailing 'a' of S' -> Sa
            return true;
        } else {
            return false;
        }
    }
}

Define the third function in the same way. (Note that i must be passed by reference.)
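
Java has no pass-by-reference for an int, so the int &i signature above cannot be written directly in Java. As a rough sketch of one alternative (not the answerer's code), each method below returns the set of input positions at which its nonterminal could finish, which also provides the backtracking the assignment asks for. The class and method names (LeftFactoredParser, parseS, parseS1, parseS2, accepts) are made up for illustration, and the grammar is assumed to be the left-factored one above.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: recursive descent over the left-factored grammar
//   S  -> a S' | (empty)
//   S' -> a S S" | S a
//   S" -> b | a
// Each method returns every position where its nonterminal may end.
class LeftFactoredParser {

    // S -> a S' | (empty)
    static Set<Integer> parseS(char[] in, int i) {
        Set<Integer> ends = new TreeSet<>();
        ends.add(i); // the empty alternative consumes nothing
        if (i < in.length && in[i] == 'a') {
            ends.addAll(parseS1(in, i + 1));
        }
        return ends;
    }

    // S' -> a S S" | S a
    static Set<Integer> parseS1(char[] in, int i) {
        Set<Integer> ends = new TreeSet<>();
        // Alternative 1: a S S"
        if (i < in.length && in[i] == 'a') {
            for (int k : parseS(in, i + 1)) {
                ends.addAll(parseS2(in, k));
            }
        }
        // Alternative 2: S a
        for (int k : parseS(in, i)) {
            if (k < in.length && in[k] == 'a') {
                ends.add(k + 1);
            }
        }
        return ends;
    }

    // S" -> b | a
    static Set<Integer> parseS2(char[] in, int i) {
        Set<Integer> ends = new TreeSet<>();
        if (i < in.length && (in[i] == 'a' || in[i] == 'b')) {
            ends.add(i + 1);
        }
        return ends;
    }

    // The string is in the language only if S can end exactly at the end of the input.
    static boolean accepts(String s) {
        char[] in = s.toCharArray();
        return parseS(in, 0).contains(in.length);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"", "aa", "aab", "ab"}) {
            System.out.println("\"" + s + "\" -> " + accepts(s));
        }
    }
}
```

A string is accepted only when one of the end positions for S equals the input length, so partial matches are rejected. This version recomputes subresults and can be slow on long inputs; caching the result per position would be a natural refinement.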

You can refer to any top-down parsing tutorial for more details. You can also refer to this one.

Hope this helps.

I am a bit rusty on my CS classes, but the following code worked for me:

public static boolean fullCheck(char[] toTest, char[][] g) {
    int val = checkOnAll(0, toTest, g);

    if (val == toTest.length) {
        return true;
    }
    return false;
}

public static int checkOnAll(int charPos, char[] toTest, char[][] g) {
    for(int i = 0; i < g.length; i++) {
        int val = checkOne(charPos, toTest, g[i], g);
        if (val != -1) {
            return val;
        }
    }
    return -1;
}

//also needs length checks
public static int checkOne(int charPos, char[] toTest, char[] rule, char[][] rules) {
    for(int j = 0; j < rule.length; j++) {
        if (charPos >= toTest.length) {
            return -1;
        }
        if(rule[j] == 'S') {
            int value = checkOnAll(charPos, toTest, rules);
            if (value == -1) {
                return -1;
            } else {
                charPos = value - 1;
            }
        } else if (toTest[charPos] != rule[j]) {
            return -1;
        }
        charPos++;
    }
    return charPos;
}

Instead of the S function, use fullCheck (pass the input to it as a char array). I also changed the g array to:

        char[][] g = new char[][]
            { {'a', 'a', 'S', 'b'},
              {'a', 'a', 'S', 'a'},
              {'a', 'S', 'a'},
              {} };

(The \0 gave me trouble with the length checks; the change above was the simplest solution.)

I found a few issues in your code, and although I am not totally sure that my own code is bug-free, I thought I'd share anyway:

1. When S returns false inside your recursion, the value is ignored.
2. pn should be restricted to tracking which character of the test string we are on, not which rule character.
3. Even if the value returned is true, you must make sure you checked the entire test string and not just part of it. I did not see you do that.
4. If you have one very long rule but a short input, you might get an array-out-of-bounds exception, since you never look at the test string's length.
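
The four points above can be sketched as one small recognizer. This is only an illustration under stated assumptions, not a drop-in replacement for either piece of code above: it assumes every rule contains at most one S (as the question states), that the empty string is written as an empty rule {}, and that no rule consists of S alone. The names SingleNonterminalRecognizer, derives, indexOfS, regionEquals and accepts are hypothetical.

```java
// Hypothetical sketch: backtracking recognizer for grammars whose rules
// have the shape  prefix S suffix  or are made of terminals only.
class SingleNonterminalRecognizer {

    // True if S can derive exactly the slice in[lo..hi).
    static boolean derives(char[][] g, char[] in, int lo, int hi) {
        for (char[] rule : g) {
            int sPos = indexOfS(rule);
            if (sPos < 0) {
                // Terminal-only rule (including the empty rule): must equal the slice.
                if (rule.length == hi - lo && regionEquals(in, lo, rule, 0, rule.length)) {
                    return true;
                }
                continue;
            }
            int prefixLen = sPos;
            int suffixLen = rule.length - sPos - 1;
            if (prefixLen + suffixLen == 0) continue;       // S -> S would never terminate
            if (prefixLen + suffixLen > hi - lo) continue;  // rule is longer than the input left
            // The recursive result is used, not ignored; a failure falls through
            // to the next rule, which is the backtracking step.
            if (regionEquals(in, lo, rule, 0, prefixLen)
                    && regionEquals(in, hi - suffixLen, rule, sPos + 1, suffixLen)
                    && derives(g, in, lo + prefixLen, hi - suffixLen)) {
                return true;
            }
        }
        return false;
    }

    // Index of the nonterminal 'S' in the rule, or -1 if the rule has none.
    static int indexOfS(char[] rule) {
        for (int j = 0; j < rule.length; j++) {
            if (rule[j] == 'S') return j;
        }
        return -1;
    }

    // Compares len characters of a (from aFrom) with b (from bFrom).
    static boolean regionEquals(char[] a, int aFrom, char[] b, int bFrom, int len) {
        for (int k = 0; k < len; k++) {
            if (a[aFrom + k] != b[bFrom + k]) return false;
        }
        return true;
    }

    // The whole input must be derived, so a partial match is never accepted.
    static boolean accepts(char[][] g, String s) {
        char[] in = s.toCharArray();
        return derives(g, in, 0, in.length);
    }

    public static void main(String[] args) {
        char[][] g = { {'a', 'a', 'S', 'b'}, {'a', 'a', 'S', 'a'}, {'a', 'S', 'a'}, {} };
        for (String s : new String[] {"", "aa", "aab", "ab"}) {
            System.out.println("\"" + s + "\" -> " + accepts(g, s));
        }
    }
}
```

Because each rule has a single S, the terminals before and after it can be matched against the two ends of the input, and the recursion then only has to handle the shorter middle part. Every rule with an S strips at least one character, so the recursion always terminates.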

I tried my own code with various inputs, and I have a feeling I might have missed something, but I could not find it. If you find a problem, please let me know :)

Good luck.
