Grammar nltk for list in Python

I have to create an NLTK grammar for a list in Python. I have this grammar for a text:

import nltk

grammar1 = nltk.CFG.fromstring("""
    S -> NP VP
    VP -> V NP | V NP PP
    PP -> P NP
    V -> "saw" | "ate" | "walked"
    NP -> "John" | "Mary" | "Bob" | Det N | Det N PP
    Det -> "a" | "an" | "the" | "my"
    N -> "man" | "dog" | "cat" | "telescope" | "kitchen"
    P -> "in" | "on" | "by" | "with"
    """)

sent = "the cat ate a telescope in the kitchen".split()
rd_parser = nltk.RecursiveDescentParser(grammar1)

for tree in rd_parser.parse(sent):
    print(tree)

Now, how can I do the same for a list? I need to test legal and illegal lists with a basic grammar. I couldn't find any information about NLTK and lists, and I don't really understand how to do this.

Notice that the following line of code already creates a list (of strings).

sent = "the cat ate a telescope in the kitchen".split()

You have also created a recursive descent parser for your grammar with the following line. Note that you only need to do this once.

rd_parser = nltk.RecursiveDescentParser(grammar1)

Now, if you want to test a different list of tokens, simply do something like this:

L = ["John", "walked", "the", "dog"]
result = rd_parser.parse(L)  # an iterator over parse trees, not a list
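
Since parse() hands back an iterator of trees, one way to turn it into a legal/illegal verdict is to check whether it yields at least one tree. The following is only a minimal sketch: is_legal is a hypothetical helper name (not part of NLTK), and it assumes the parser argument is something like the rd_parser built above. It also treats the ValueError that NLTK raises for words the grammar does not cover as "illegal".

```python
def is_legal(parser, tokens):
    """Return True if `parser` yields at least one parse tree for `tokens`.

    `parser` is assumed to be an object with a parse() method, such as
    the rd_parser created earlier from grammar1.
    """
    try:
        # parse() returns an iterator; any() stops at the first tree found.
        return any(True for _ in parser.parse(tokens))
    except ValueError:
        # Raised by NLTK when a token is not covered by the grammar.
        return False
```

With the grammar from the question, a call might look like is_legal(rd_parser, ["John", "walked", "the", "dog"]). Returning False for uncovered words is a design choice; if you need to tell "unknown word" apart from "known words in an illegal order", let the exception propagate instead.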

You have a parser that can be applied to lists of tokens, and a collection of test materials in different formats. Quoting from your comment: "empty list, list with one token, list with several tokens, list with numbers, tuple, and dictionary."

The parser can handle "sequences" of strings, which in your case means a list or tuple whose elements are strings (each string being one word). The parser cannot handle anything else; if your code has to deal with other types, write Python code to check their type before the parser sees them.

You'll be interested in the built-in functions isinstance() (preferred) and type(). E.g.,

if isinstance(sent, (tuple, list)) and all(isinstance(w, str) for w in sent):
    # A tuple or list of strings; try to parse it.
    trees = rd_parser.parse(sent)
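
To see how that check behaves on the kinds of input listed in the comment, it can be wrapped in a small function and run over sample values. This is a sketch only: is_token_sequence and the candidates list are names made up for illustration, and the sample values are assumptions about what the test materials look like.

```python
def is_token_sequence(obj):
    """True if obj is a list or tuple whose elements are all strings."""
    return isinstance(obj, (list, tuple)) and all(isinstance(w, str) for w in obj)

# Hypothetical samples, one per kind of input mentioned in the comment.
candidates = [
    [],                                # empty list
    ["John"],                          # list with one token
    ["John", "walked", "the", "dog"],  # list with several tokens
    [1, 2, 3],                         # list with numbers
    ("the", "cat"),                    # tuple of strings
    {"the": "cat"},                    # dictionary
]

for candidate in candidates:
    verdict = "pass to parser" if is_token_sequence(candidate) else "reject"
    print(repr(candidate), "->", verdict)
```

Note that the empty list passes this type check, because all() over an empty sequence is True; whether it counts as a legal sentence is then up to the grammar. The list of numbers and the dictionary are rejected before the parser ever sees them.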
