How to build a trie to solve this parsing problem

I am trying to use a trie to solve this problem:

A symbolic string generator consists of two parts: a set of start symbols and a set of generation rules.
For example:
Start symbol: ['S'], Rules of generation: ["S → abc", "S → aA", "A → b", "A → c"]
Then, symbolic string abc can be generated because S → abc.
Symbolic string ab can be generated because S → aA → ab.
Symbolic string ac can be generated because S → aA → ac.
Now, given a symbolic string generator and a symbolic string, return True if the symbolic string can be generated, and False otherwise.

Example
Given generator = ["S -> abcd", "S -> Ad", "A -> ab", "A -> c"], startSymbol = S, symbolString = “abd”, return True.

explanation:
S → Ad → abd

Given generator = ["S → abc", "S → aA", "A → b", "A → c"], startSymbol = S, symbolString = “a”, return False

I think the key to this problem is building a trie. Here is what I tried to write:

def build_trie(values): #value is like ['abc', 'Ad'...]
    root = {}
    for word in values:
        current = root
        is_end = False
        for c in word:
            if 'A' <= c <= 'Z':
                vals = m[c] #m is a mapping of {'S': ['abc', 'Ad'], ...}
                rs = build_trie(vals)
                for k in rs:
                    if k not in current:
                        current[k] = rs[k]
                    else:
                        # stuck here...
                        pass

                        # temp = collections.defaultdict(dict)
                        # for d in (current[k], rs[k]):
                        #     for k, v in d.items():
                        #         if k in temp and k != '__end__':
                        #             temp[k].update(v)
                        #         else:
                        #             temp[k] = v
                        # # current[k].update(rs[k])
                        # current[k] = temp[k]
                is_end = True
            else:
                current = current.setdefault(c, {})
                is_end = False
        if not is_end:
            current['__end__'] = '__end__'
    return root

but I got stuck on the else branch. I have not figured out how to write this trie. Any clues?

There are multiple parser libraries in Python you may want to use. I have used the Lark parser. Its authors provide a comparison of various Python parsers.

During my college days I implemented an LALR(1) parser in C, but I guess it will be of little use here. I found a useful implementation in Python here, in case you want to write the entire parser yourself. I haven't tested that code.
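If a full parser is more than this problem needs, a small backtracking matcher may be enough. The following is only a minimal sketch under stated assumptions, not the implementation linked above: the names parse_rules and can_generate are hypothetical, nonterminals are assumed to be single uppercase letters (as in the question's code), and the grammar is assumed to contain no cycles such as A → A or A → AA, which would make the recursion loop forever.

```python
from functools import lru_cache


def parse_rules(generator):
    """Turn ["S -> abcd", "S → Ad"] into {"S": ["abcd", "Ad"]}."""
    rules = {}
    for rule in generator:
        # Accept both arrow spellings used in the problem statement.
        head, body = rule.replace("→", "->").split("->", 1)
        rules.setdefault(head.strip(), []).append(body.strip())
    return rules


def can_generate(generator, start_symbol, symbol_string):
    """Return True if symbol_string can be derived from start_symbol."""
    rules = parse_rules(generator)
    n = len(symbol_string)

    @lru_cache(maxsize=None)
    def match(form, i):
        # `form` is the part of the derivation still to be matched,
        # `i` is the position reached in symbol_string.
        if not form:
            return i == n
        # Prune: every lowercase symbol in `form` consumes one character.
        if sum(1 for c in form if not c.isupper()) > n - i:
            return False
        first, rest = form[0], form[1:]
        if first.isupper():
            # Nonterminal (assumed: one uppercase letter): try each rule.
            return any(match(alt + rest, i) for alt in rules.get(first, []))
        # Terminal: it must equal the next input character.
        return i < n and symbol_string[i] == first and match(rest, i + 1)

    return match(start_symbol, 0)


generator = ["S -> abcd", "S -> Ad", "A -> ab", "A -> c"]
print(can_generate(generator, "S", "abd"))
```

The memoization keeps repeated (form, position) pairs from being explored twice. For left-recursive or cyclic grammars a general algorithm such as Earley or CYK would be a safer choice.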

For the given grammar, I have written a validator using Lark as below.

from lark import Lark
import sys

grammar = """
        start: "abcd"
         | A "d"
        A: "ab"
         | "c"
        """

parser = Lark(grammar)

def check_grammer(word):
    # Return True if Lark can parse the word with the grammar above.
    try:
        parser.parse(word)
        return True
    except Exception as exception:
        print(exception)
        return False


word = sys.argv[1]
print(check_grammer(word))
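The grammar above is written by hand for one generator. To handle an arbitrary list of rules, the grammar text could be produced from the rule strings. This is a rough sketch under assumptions: to_lark_grammar and the nt_ prefix are hypothetical names, nonterminals are assumed to be single uppercase letters, and every right-hand side is assumed to be non-empty. Lark treats uppercase names as terminals, which cannot be recursive, so the sketch maps each nonterminal to a lowercase rule name instead.

```python
def to_lark_grammar(generator, start_symbol):
    """Build Lark grammar text from rules such as "S -> Ad"."""
    rules = {}
    for rule in generator:
        # Accept both arrow spellings used in the problem statement.
        head, body = rule.replace("→", "->").split("->", 1)
        rules.setdefault(head.strip(), []).append(body.strip())

    def rule_name(nonterminal):
        # Hypothetical naming scheme: "S" becomes the Lark rule "nt_s".
        return "nt_" + nonterminal.lower()

    def alternative(body):
        # Group runs of lowercase symbols into one quoted literal and
        # replace each uppercase symbol with its rule name.
        parts, literal = [], ""
        for ch in body:
            if ch.isupper():
                if literal:
                    parts.append('"%s"' % literal)
                    literal = ""
                parts.append(rule_name(ch))
            else:
                literal += ch
        if literal:
            parts.append('"%s"' % literal)
        return " ".join(parts)

    lines = ["start: " + rule_name(start_symbol)]
    for head, bodies in rules.items():
        lines.append(rule_name(head) + ": " + " | ".join(alternative(b) for b in bodies))
    return "\n".join(lines)


generator = ["S -> abcd", "S -> Ad", "A -> ab", "A -> c"]
print(to_lark_grammar(generator, "S"))
```

The returned text should be usable as the argument to Lark(...) in place of the hand-written grammar, though I have only checked the generated text, not every grammar it might be given.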

Hope it helps!
