
Are these Python programs for detecting the ambiguity of a finite grammar correct?

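I've been working through Udacity CS262, and for the Detecting Ambiguity problem I'm not sure whether my solution is correct, nor whether the "official" solution is correct.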

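Brief description of the problem: write a function isambig(grammar, start, string) that takes a finite context-free grammar (described as a Python dictionary, although the test cases below encode it as a list of (left-hand side, right-hand side) tuples), the grammar's start symbol, and a string. If there are two parse trees leading to the string, then the grammar is ambiguous (or at least that is my understanding of ambiguity; please correct me if I am mistaken). If the grammar is ambiguous, return True. Otherwise return False.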

Test cases:

grammar1 = [
       ("S", [ "P", ]),
       ("S", [ "a", "Q", ]) ,
       ("P", [ "a", "T"]),
       ("P", [ "c" ]),
       ("Q", [ "b" ]),
       ("T", [ "b" ]),
       ]
print isambig(grammar1, "S", ["a", "b"]) == True
print isambig(grammar1, "S", ["c"]) == False

grammar2 = [
       ("A", [ "B", ]),
       ("B", [ "C", ]),
       ("C", [ "D", ]),
       ("D", [ "E", ]),
       ("E", [ "F", ]),
       ("E", [ "G", ]),
       ("E", [ "x", "H", ]),
       ("F", [ "x", "H"]),
       ("G", [ "x", "H"]),
       ("H", [ "y", ]),
       ]
print isambig(grammar2, "A", ["x", "y"]) == True
print isambig(grammar2, "E", ["y"]) == False

grammar3 = [ # Rivers in Kenya
       ("A", [ "B", "C"]),
       ("A", [ "D", ]),
       ("B", [ "Dawa", ]),
       ("C", [ "Gucha", ]),
       ("D", [ "B", "Gucha"]),
       ("A", [ "E", "Mbagathi"]),
       ("A", [ "F", "Nairobi"]),
       ("E", [ "Tsavo" ]),
       ("F", [ "Dawa", "Gucha" ])
       ]
print isambig(grammar3, "A", ["Dawa", "Gucha"]) == True
print isambig(grammar3, "A", ["Dawa", "Gucha", "Nairobi"]) == False
print isambig(grammar3, "A", ["Tsavo"]) == False

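I added my own test case. I'm not sure it is right, but I can see only one possible parse tree leading to the string "a b", so this string does not prove the grammar ambiguous. And I don't believe the grammar is ambiguous.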

grammar4 = [ # Simple test case
       ("S", [ "P", "Q"]),
       ("P", [ "a", ]),
       ("Q", [ "b", ]),
       ]
print isambig(grammar4, "S", ["a", "b"]) == False

Here is the "official" program:

def expand(tokens_and_derivation, grammar):
    (tokens,derivation) = tokens_and_derivation
    for token_pos in range(len(tokens)):
        for rule_index in range(len(grammar)):
            rule = grammar[rule_index]
            if tokens[token_pos] == rule[0]:
                yield ((tokens[0:token_pos] + rule[1] + tokens[token_pos+1:]), derivation + [rule_index])

def isambig(grammar, start, utterance):
    enumerated = [([start], [])]
    while True:
        new_enumerated = enumerated
        for u in enumerated:
            for i in expand(u,grammar):
                if not i in new_enumerated:
                    new_enumerated = new_enumerated + [i]

        if new_enumerated != enumerated:
            enumerated = new_enumerated
        else:
            break
    result = [xrange for xrange in enumerated if xrange[0] == utterance]
    print result
    return len(result) > 1

Here is my own, much longer program:

def expand(grammar, symbol):
    result = []
    for rule in grammar:
        if rule[0] == symbol:
            result.append(rule[1])
    return result

def expand_first_nonterminal(grammar, string):
    result = []
    for i in xrange(len(string)):
        if isterminal(grammar, string[i]) == False:
            for j in expand(grammar, string[i]):
                result.append(string[:i]+j+string[i+1:])
            return result
    return None

def full_expand_string(grammar,string, result):
    for i in expand_first_nonterminal(grammar,string):
        if allterminals(grammar,i):
            result.append(i)
        else:
            full_expand_string(grammar,i,result)

def isterminal(grammar,symbol):
    for rule in grammar:
        if rule[0] == symbol:
            return False
    return True

def allterminals(grammar,string):
    for symbol in string:
        if isterminal(grammar,symbol) == False:
            return False
    return True

def returnall(grammar, start):
    result = []
    for rule in grammar:
        if rule[0] == start:
            if allterminals(grammar,rule[1]):
                return rule[1]
            else:
                full_expand_string(grammar, rule[1], result)
    return result

def isambig(grammar, start, utterance):
    count = 0
    for i in returnall(grammar,start):
        if i == utterance:
            count+=1
    if count > 1:
        return True
    else:
        return False

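Now, my program passes all the test cases, including the one I added (grammar4), but the official solution passes all the test cases except the one I added. It seems to me that either the test case is wrong or the official solution is wrong.

Is the official solution correct? Is my solution correct?

To me, it looks like grammar4 is not ambiguous. There is only one parse tree: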

S -> PQ
P -> a
Q -> b

    S
    |
 ___|____
P        Q
|        |
a        b

However, the official program says it is ambiguous, because it applies the rules P -> a and Q -> b one after the other, in either order:

[(['a', 'b'], [0, 1, 2]), (['a', 'b'], [0, 2, 1])]

(There are now two rule sequences, 0,1,2 and 0,2,1.)

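Both sequences describe the same parse tree and differ only in the order in which P and Q are expanded, so the "official" program appears to wrongly report grammar4 as ambiguous.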
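One way to avoid counting the same tree twice is to always expand the leftmost nonterminal, since each parse tree corresponds to exactly one leftmost derivation. The following is only a sketch under the question's assumptions (a finite, non-recursive grammar encoded as a list of tuples); the names is_terminal, leftmost_derivations and isambig_leftmost are made up for illustration and are not part of the course material.

```python
def is_terminal(grammar, symbol):
    # A symbol is a terminal if no rule rewrites it.
    return all(lhs != symbol for lhs, _ in grammar)

def leftmost_derivations(grammar, start, utterance):
    """Yield the rule-index sequence of every leftmost derivation of utterance.

    Assumes a finite (non-recursive) grammar, as in the question;
    a recursive grammar would make this loop forever.
    """
    stack = [([start], [])]
    while stack:
        tokens, derivation = stack.pop()
        # Position of the leftmost nonterminal, or None if all are terminals.
        pos = next((i for i, t in enumerate(tokens)
                    if not is_terminal(grammar, t)), None)
        if pos is None:
            if tokens == utterance:
                yield derivation
            continue
        # Expand only that leftmost nonterminal, once per matching rule.
        for rule_index, (lhs, rhs) in enumerate(grammar):
            if lhs == tokens[pos]:
                stack.append((tokens[:pos] + rhs + tokens[pos + 1:],
                              derivation + [rule_index]))

def isambig_leftmost(grammar, start, utterance):
    # Hypothetical replacement for isambig: more than one leftmost
    # derivation means more than one parse tree.
    return len(list(leftmost_derivations(grammar, start, utterance))) > 1

grammar4 = [
    ("S", ["P", "Q"]),
    ("P", ["a"]),
    ("Q", ["b"]),
]
print(list(leftmost_derivations(grammar4, "S", ["a", "b"])))  # [[0, 1, 2]]
```

With this restriction the sequence 0,2,1 is never generated, because Q is not the leftmost nonterminal after rule 0 has been applied.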

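UPDATE: I looked at your code and ran some tests. Apart from not handling recursion (the official version does not handle recursion either), your program seems to distinguish correctly between ambiguous and unambiguous cases.

A simple test: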

grammar5 = [ 
             ("S", ["A", "B"]),
             ("S", ["B", "A"]),
             ("A", ["a"]),
             ("B", ["a"]),
           ]   
print(isambig(grammar5, "S", ["a", "a"]))

S -> AB
S -> BA
A -> a
B -> a

    S
    |
 ___|____
A        B
|        |
a        a

    S
    |
 ___|____
B        A
|        |
a        a

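Your version returns "ambiguous" (as does the "official" version).

If you remove ("S", ["B", "A"]), your version correctly switches to "not ambiguous", while the other version still returns "ambiguous" (we are back in the grammar4 situation).

Maybe someone else (with more experience than I have) can chime in.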
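On the recursion caveat from the update: for one given utterance, recursion can be tolerated by pruning sentential forms that can no longer match. The sketch below is an assumption-laden illustration, not either program from the question. It assumes the grammar has no empty right-hand sides and no cycles of unit rules (such as A -> B together with B -> A); without those assumptions the loop may not terminate. The name count_parses is hypothetical.

```python
def count_parses(grammar, start, utterance):
    """Count leftmost derivations (one per parse tree) of utterance.

    Assumptions (not checked): no rule has an empty right-hand side,
    and there is no cycle of unit rules. Under those assumptions a
    sentential form never shrinks, so overlong forms can be dropped.
    """
    nonterminals = {lhs for lhs, _ in grammar}
    utterance = list(utterance)
    count = 0
    stack = [[start]]
    while stack:
        tokens = stack.pop()
        if len(tokens) > len(utterance):
            continue  # already too long to ever match
        pos = next((i for i, t in enumerate(tokens) if t in nonterminals), None)
        if pos is None:
            if tokens == utterance:
                count += 1
            continue
        if tokens[:pos] != utterance[:pos]:
            continue  # the terminal prefix already disagrees
        for lhs, rhs in grammar:
            if lhs == tokens[pos]:
                stack.append(tokens[:pos] + rhs + tokens[pos + 1:])
    return count

# A recursive grammar, chosen here only as an illustration.
expr_grammar = [
    ("E", ["E", "+", "E"]),
    ("E", ["x"]),
]
print(count_parses(expr_grammar, "E", ["x", "+", "x", "+", "x"]))  # 2
```

The two derivations correspond to grouping the input as (x + x) + x and as x + (x + x), so this utterance shows the illustrative grammar to be ambiguous.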

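UPDATE 2: Ira Baxter mentioned that whether a context-free grammar is ambiguous is an undecidable problem.

See also "How is proving a context free language to be ambiguous undecidable?"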
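The undecidability result concerns context-free grammars in general. A finite (non-recursive) grammar like the ones in the question generates only finitely many strings, so for that special case the whole grammar can be checked by brute force: enumerate every leftmost derivation and see whether any string is reached more than once. This is a minimal sketch under that assumption; ambiguous_strings is a hypothetical helper name.

```python
from collections import Counter

def ambiguous_strings(grammar, start):
    """Return every string that has more than one leftmost derivation.

    Assumes a finite (non-recursive) grammar; a recursive grammar
    would make this enumeration run forever.
    """
    nonterminals = {lhs for lhs, _ in grammar}
    counts = Counter()
    stack = [[start]]
    while stack:
        tokens = stack.pop()
        pos = next((i for i, t in enumerate(tokens) if t in nonterminals), None)
        if pos is None:
            counts[tuple(tokens)] += 1  # one finished leftmost derivation
            continue
        for lhs, rhs in grammar:
            if lhs == tokens[pos]:
                stack.append(tokens[:pos] + rhs + tokens[pos + 1:])
    return sorted(s for s, n in counts.items() if n > 1)

grammar5 = [
    ("S", ["A", "B"]),
    ("S", ["B", "A"]),
    ("A", ["a"]),
    ("B", ["a"]),
]
print(ambiguous_strings(grammar5, "S"))  # [('a', 'a')]
```

An empty result means no string of the finite language has two parse trees. Removing ("S", ["B", "A"]) from grammar5 gives an empty list, which matches the test described above.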
