Filter a list of lists by the first two items of each sublist - natural language processing with NLTK

Question

I have generated a list of trigrams and their frequencies in NLTK with this code:

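import nltk
from nltk.collocations import TrigramAssocMeasures, TrigramCollocationFinder

# docs is the raw text to analyse
tokens = nltk.wordpunct_tokenize(docs)
trigram_measures = TrigramAssocMeasures()
finderT = TrigramCollocationFinder.from_words(tokens)
# list of ((word1, word2, word3), relative frequency) pairs
scoredT = finderT.score_ngrams(trigram_measures.raw_freq)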

Given a user-defined 'input' of two words, I want to filter the list scoredT to return those entries where the input matches the first two items of the sublist in scoredT.

scoredT looks like this:

[(('out', 'to', 'the'), 2.7147650642313413e-05),
(('proud', 'of', 'you'), 2.7147650642313413e-05)]

So if input were equal to 'out to', I'd like to filter the list to return 'the'.

I tried

matches = filter(scoredT[0:len(scoredT)][0:1]==input, scoredT)

but get the following error: TypeError: 'bool' object is not callable

Answer 1

The expression scoredT[0:len(scoredT)][0:1]==input compares a slice of scoredT (a one-element list holding its first entry) to input, so it evaluates to a single boolean. You then pass that boolean to filter, which requires its first argument to be a boolean-valued function, hence your error. The Pythonic way:

matches = [(trigram, score) for (trigram, score) in scoredT if trigram[:2] == input]
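For comparison, the original filter call can also be made to work by passing it a function instead of a boolean. This is only a minimal sketch, assuming Python 3 (where filter returns an iterator) and sample data in the shape shown in the question; the name query is hypothetical and is used instead of input to avoid shadowing the built-in.

```python
# Sample data in the same shape as scoredT from the question
scoredT = [
    (('out', 'to', 'the'), 2.7147650642313413e-05),
    (('proud', 'of', 'you'), 2.7147650642313413e-05),
]

query = ('out', 'to')  # hypothetical name; note it is a tuple, not a string

# filter calls the function once per (trigram, score) pair and keeps
# the pairs for which it returns True
matches = list(filter(lambda item: item[0][:2] == query, scoredT))
print(matches)
```

The list comprehension and the filter call select the same entries; the comprehension is usually considered easier to read.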

Also, you need to make sure that input is a tuple: trigram[:2] is a tuple such as ('out', 'to'), so it will never compare equal to the string 'out to'.
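Putting the pieces together, one possible way to go from a raw two-word string to the matching third words is sketched below. The names user_input, prefix, and next_words are hypothetical, and splitting on whitespace is an assumption; if the trigrams were built with nltk.wordpunct_tokenize, tokenizing the user's text the same way would probably be more consistent.

```python
# Sample data in the same shape as scoredT from the question
scoredT = [
    (('out', 'to', 'the'), 2.7147650642313413e-05),
    (('proud', 'of', 'you'), 2.7147650642313413e-05),
]

user_input = 'out to'                # hypothetical raw string from the user
prefix = tuple(user_input.split())   # becomes ('out', 'to')

# Keep only the third word of each trigram whose first two words match
next_words = [trigram[2] for (trigram, score) in scoredT
              if trigram[:2] == prefix]
print(next_words)  # ['the']
```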

Answer 1: 0, accepted, 2016-03-30 22:26:01
