Wrong probability calculation in context-free grammar (NLTK, Python 3)

I have a problem showing the most likely constituency structure of a sentence using NLTK's probabilistic grammar.

Here is my sentence: "Ich sah den Tiger unter der Felse"

Here is my code:

import nltk  # needed because nltk.ViterbiParser is referenced below
from nltk import PCFG
tiger_grammar = PCFG.fromstring("""
S -> NP VP [1.0]
NP -> ART NN [0.25] | PPER [0.5] | NP PP [0.25]
VP -> VVFIN NP [0.75] | VVFIN NP PP [0.25]
PP -> APPR NP [1.0]
APPR -> 'unter' [1.0]
PPER -> 'Ich' [1.0]
VVFIN -> 'sah' [1.0]
NN -> 'Tiger' [0.5] | 'Felse' [0.5]
ART -> 'den' [0.5] | 'der' [0.5]
""")
# The Viterbi parser yields only the single most probable parse of the sentence.
viterbi_parser = nltk.ViterbiParser(tiger_grammar)
trees = viterbi_parser.parse(['Ich', 'sah', 'den', 'Tiger', 'unter', 'der', 'Felse'])
for t in trees:
    print(t)

Here is what I get:

(S
  (NP (PPER Ich))
  (VP
    (VVFIN sah)
    (NP (ART den) (NN Tiger))
    (PP (APPR unter) (NP (ART der) (NN Felse))))) (p=0.000488281)

But the desired result is:

(S
  (NP (PPER Ich))
  (VP
    (VVFIN sah)
    (NP
      (NP (ART den) (NN Tiger))
      (PP (APPR unter) (NP (ART der) (NN Felse))))))

(I didn't add the probability here, but it should be displayed as well.)

According to the grammar, the probability of forming VP from VVFIN and NP is higher than from VVFIN, NP and PP. But the parser returns the second structure.

What am I doing wrong?

I would be grateful for any suggestions!

This happens simply because your desired result has a lower probability than the result you got. The probability of a parse is the product of the probabilities of every rule it uses, not just the VP rule. We can compute the probability of your desired result:

S -> NP VP       1.0

NP -> PPER       0.5
PPER -> Ich      1.0

VP -> VVFIN NP   0.75
VVFIN -> sah     1.0
NP -> NP PP      0.25

NP -> ART NN     0.25
ART -> den       0.5
NN -> Tiger      0.5

PP -> APPR NP    1.0
APPR -> unter    1.0

NP -> ART NN     0.25
ART -> der       0.5
NN -> Felse      0.5

Multiplying these together gives a probability of 0.0003662109375, which is definitely less than the 0.000488281 of the result you got.
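The arithmetic above can be checked with a few lines of plain Python. This is only a sketch: the rule probabilities are copied from the grammar in the question, while the names `RULE_PROB`, `SHARED`, `FLAT_PARSE`, `NESTED_PARSE` and `tree_probability` are made up for illustration and are not part of NLTK.

```python
from math import prod

# Rule probabilities copied from the grammar in the question.
RULE_PROB = {
    "S -> NP VP": 1.0,
    "NP -> ART NN": 0.25,
    "NP -> PPER": 0.5,
    "NP -> NP PP": 0.25,
    "VP -> VVFIN NP": 0.75,
    "VP -> VVFIN NP PP": 0.25,
    "PP -> APPR NP": 1.0,
    "APPR -> unter": 1.0,
    "PPER -> Ich": 1.0,
    "VVFIN -> sah": 1.0,
    "NN -> Tiger": 0.5,
    "NN -> Felse": 0.5,
    "ART -> den": 0.5,
    "ART -> der": 0.5,
}

# Rules that occur in both parses (hypothetical helper list).
SHARED = [
    "S -> NP VP",
    "NP -> PPER", "PPER -> Ich",
    "VVFIN -> sah",
    "NP -> ART NN", "ART -> den", "NN -> Tiger",
    "PP -> APPR NP", "APPR -> unter",
    "NP -> ART NN", "ART -> der", "NN -> Felse",
]

# The parse the Viterbi parser returned: the PP attaches directly to the VP.
FLAT_PARSE = SHARED + ["VP -> VVFIN NP PP"]

# The desired parse: the PP attaches to the object NP, which costs one extra rule.
NESTED_PARSE = SHARED + ["VP -> VVFIN NP", "NP -> NP PP"]


def tree_probability(rules):
    """Multiply the probabilities of all rules used in one parse tree."""
    return prod(RULE_PROB[rule] for rule in rules)


p_flat = tree_probability(FLAT_PARSE)
p_nested = tree_probability(NESTED_PARSE)
print("flat  :", p_flat)
print("nested:", p_nested)
print("flat is more probable:", p_flat > p_nested)
```

The two parses share every rule except the attachment of the PP, so the comparison comes down to 0.25 for VP -> VVFIN NP PP against 0.75 * 0.25 = 0.1875 for VP -> VVFIN NP followed by NP -> NP PP. The higher VP probability is outweighed by the extra NP -> NP PP rule.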
