How to program a context-free grammar?

I have two classes here.

The CFG class takes a string array in its constructor that defines the context-free grammar. The SampleTest class tests the CFG class: it passes the grammar (C) to the class, takes a string from the user as a command-line argument, and checks whether that string can be generated by the context-free grammar.

The problem I'm running into is a stack overflow (obviously). I'm assuming that I just created a never-ending recursive function.

Could someone take a look at the processData() function and help me figure out how to correctly configure it? I'm basically using recursion to generate all the possible strings that the CFG can create, then returning true if one of those generated strings matches the user's input (inString). Oh, and the wkString parameter is simply the string being generated by the grammar through each recursive iteration.

public class SampleTest {
  public static void main(String[] args) {
    // Language: strings that contain 0+ b's, followed by 2+ a's,
    // followed by 1 b, and ending with 2+ a's.
    String[] C = { "S=>bS", "S=>aaT", "T=>aT", "T=>bU", "U=>Ua", "U=>aa" };
    String inString, startWkString;
    boolean accept1;
    CFG CFG1 = new CFG(C);
    if (args.length >= 1) {
      // Input string is command line parameter
      inString = args[0];
      char[] startNonTerm = new char[1];
      startNonTerm[0] = CFG1.getStartNT();
      startWkString = new String(startNonTerm);
      accept1 = CFG1.processData(inString, startWkString);
      System.out.println(" Accept String? " + accept1);
    }
  } // end main
} // end class


public class CFG {

  private String[] code;
  private char startNT;

  CFG(String[] c) {
    this.code = c;
    setStartNT(c[0].charAt(0));
  }

  void setStartNT(char startNT) {
    this.startNT = startNT;
  }

  char getStartNT() {
    return this.startNT;
  }

  boolean processData(String inString, String wkString) {
    if (inString.equals(wkString)) {
      return true;
    } else if (wkString.length() > inString.length()) {
      return false;
    }

    // search for non-terminal in the working string
    boolean containsNT = false;
    for (int i = 0; i < wkString.length(); i++) {
      // if one of the characters in the working string is a non-terminal
      if (Character.isUpperCase(wkString.charAt(i))) {
        // mark containsNT as true, and exit the for loop
        containsNT = true;
        break;
      }
    }
    // if there isn't a non-terminal in the working string
    if (containsNT == false) {
      return false;
    }

    // for each production rule
    for (int i = 0; i < this.code.length; i++) {
      // for each character on the RHS of the production rule
      for (int j = 0; j <= this.code[i].length() - 3; j++) {
        if (Character.isUpperCase(this.code[i].charAt(j))) {
          // make substitution for non-terminal, creating a new working string
          String newWk = wkString.replaceFirst(Character.toString(this.code[i].charAt(0)), this.code[i].substring(3));
          if (processData(inString, newWk) == true) {
            return true;
          }
        }

      }
    } // end for loop
    return false;
  } // end processData
} // end class

Your grammar contains a left-recursive rule:

U=>Ua

Recursive-descent parsers can't handle left-recursion, as you've just discovered.

You have two options: alter your grammar so it is no longer left-recursive, or use a parsing algorithm that can handle left-recursion, such as LR(1). In your case, U matches "at least two a characters", so we can just move the recursion to the right:

U=>aU

and everything will be fine. This isn't always possible to do in such a nice way, but in your case, avoiding left-recursion is the easy solution.
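To illustrate why the direction of the recursion matters, here is a minimal sketch of a hand-written recursive-descent recognizer for the grammar with U=>aU in place of U=>Ua. This is a different technique from the string-generating search in the question, and the class name RightRecursiveDemo and the methods matchS, matchT, matchU and accepts are hypothetical names chosen for this example. Each method either consumes at least one character before calling itself or fails, which is what keeps the recursion finite.

```java
class RightRecursiveDemo {
    // Each match method returns the index just past the matched text,
    // or -1 if the non-terminal cannot be matched at position pos.

    // S => bS | aaT
    static int matchS(String s, int pos) {
        if (pos < s.length() && s.charAt(pos) == 'b') {
            int end = matchS(s, pos + 1);      // consumed 'b' before recursing
            if (end != -1) return end;
        }
        return s.startsWith("aa", pos) ? matchT(s, pos + 2) : -1;
    }

    // T => aT | bU
    static int matchT(String s, int pos) {
        if (pos < s.length() && s.charAt(pos) == 'a') {
            int end = matchT(s, pos + 1);      // consumed 'a' before recursing
            if (end != -1) return end;
        }
        if (pos < s.length() && s.charAt(pos) == 'b') {
            return matchU(s, pos + 1);
        }
        return -1;
    }

    // U => aU | aa   (right-recursive form)
    // A direct translation of the left-recursive U => Ua would begin by
    // calling matchU(s, pos) with the same pos, so it would recurse
    // without ever consuming input.
    static int matchU(String s, int pos) {
        if (pos < s.length() && s.charAt(pos) == 'a') {
            int end = matchU(s, pos + 1);      // consumed 'a' before recursing
            if (end != -1) return end;
        }
        return s.startsWith("aa", pos) ? pos + 2 : -1;
    }

    // The input is accepted only if S matches the entire string.
    static boolean accepts(String s) {
        return matchS(s, 0) == s.length();
    }

    public static void main(String[] args) {
        System.out.println(accepts("aabaa"));   // prints true
        System.out.println(accepts("aaba"));    // prints false
    }
}
```

This sketch tries the recursive alternative first and falls back to the other one, which is enough for this particular grammar; it is not a general-purpose parser.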

You don't need this for loop: "for (int j = 0; j <= this.code[i].length() - 3; j++)". Just create a variable to hold the capital letter found in the non-terminal search you did above. Then run your outer for loop, and whenever a production rule in the String[] starts with that non-terminal, do your substitution and recursion.
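The suggestion above could look roughly like the following sketch. The class name CfgSketch and the method name derives are hypothetical stand-ins for CFG and processData, and the rule format "X=>rhs" is taken from the question. In the original loop, replaceFirst leaves the working string unchanged whenever the rule's left-hand side does not occur in it, so the recursive call receives the same arguments again; skipping rules for other non-terminals avoids that. Every rule in this grammar makes the working string longer, so the length check bounds the recursion depth, including for U=>Ua.

```java
class CfgSketch {
    private final String[] rules;   // each rule has the form "X=>rhs"

    CfgSketch(String[] rules) {
        this.rules = rules;
    }

    // Returns true if `target` can be derived from the working string.
    boolean derives(String target, String working) {
        if (working.equals(target)) {
            return true;
        }
        if (working.length() > target.length()) {
            return false;           // already too long to match
        }

        // Locate the leftmost non-terminal (an upper-case letter).
        int ntIndex = -1;
        for (int i = 0; i < working.length(); i++) {
            if (Character.isUpperCase(working.charAt(i))) {
                ntIndex = i;
                break;
            }
        }
        if (ntIndex < 0) {
            return false;           // only terminals left, and not the target
        }

        char nt = working.charAt(ntIndex);
        for (String rule : rules) {
            if (rule.charAt(0) != nt) {
                continue;           // rule belongs to a different non-terminal
            }
            // Substitute the rule's right-hand side for that non-terminal.
            String next = working.substring(0, ntIndex)
                    + rule.substring(3)
                    + working.substring(ntIndex + 1);
            if (derives(target, next)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] grammar = { "S=>bS", "S=>aaT", "T=>aT", "T=>bU", "U=>Ua", "U=>aa" };
        CfgSketch cfg = new CfgSketch(grammar);
        System.out.println(cfg.derives("aabaa", "S"));  // prints true
        System.out.println(cfg.derives("aaba", "S"));   // prints false
    }
}
```

The search is exhaustive, so it is only practical for short inputs. It also relies on every production lengthening the working string; a grammar with empty or single-symbol right-hand sides would need a different termination check.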
