Filter a list of lists based on the first two items of the sublist - natural language processing with NLTK
I have generated a list of trigrams and their frequencies in NLTK with this code:
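import nltk
from nltk.collocations import TrigramAssocMeasures, TrigramCollocationFinder

# docs holds the raw text to analyse
tokens = nltk.wordpunct_tokenize(docs)
trigram_measures = TrigramAssocMeasures()
finderT = TrigramCollocationFinder.from_words(tokens)
# list of ((word1, word2, word3), relative_frequency) pairs
scoredT = finderT.score_ngrams(trigram_measures.raw_freq)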
Given a user-defined 'input' of two words, I want to filter the list scoredT to return those entries where the input matches the first two items of the sublist in scoredT.
scoredT looks like this:
[(('out', 'to', 'the'), 2.7147650642313413e-05),
(('proud', 'of', 'you'), 2.7147650642313413e-05)]
So if input were equal to 'out to', I'd like to filter the list to return 'the'.
I tried
matches = filter(scoredT[0:len(scoredT)][0:1]==input, scoredT)
but I get the following error: TypeError: 'bool' object is not callable
The expression scoredT[0:len(scoredT)][0:1]==input compares a one-element slice of scoredT (its first entry) with input, so it evaluates to a boolean. You then pass that boolean to filter, which requires its first argument to be a boolean-valued function, hence your error. The Pythonic way:
matches = [(trigram, score) for (trigram, score) in scoredT if trigram[:2] == input]
Also, you need to make sure that input is a tuple such as ('out', 'to'): trigram[:2] is a tuple, so it never compares equal to a string like 'out to'.
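To tie the two points together, here is a minimal sketch that converts a two-word string into a tuple and returns only the third word of each matching trigram. It runs on the sample data from the question rather than on real NLTK output. The names next_words, next_words_with_filter and user_input are made up for this illustration (user_input is used instead of input so the built-in function is not shadowed), and splitting on whitespace is an assumption about how the user types the two words.

```python
# Sample data in the same shape as scoredT from the question.
scoredT = [
    (("out", "to", "the"), 2.7147650642313413e-05),
    (("proud", "of", "you"), 2.7147650642313413e-05),
]


def next_words(scored, user_input):
    """Return (third_word, score) pairs whose first two words match user_input."""
    # Assumption: the two words are separated by whitespace, "out to" -> ("out", "to").
    prefix = tuple(user_input.split())
    return [(trigram[2], score) for trigram, score in scored if trigram[:2] == prefix]


def next_words_with_filter(scored, user_input):
    """Same result with filter(), which needs a callable as its first argument."""
    prefix = tuple(user_input.split())
    matches = filter(lambda item: item[0][:2] == prefix, scored)
    return [(trigram[2], score) for trigram, score in matches]


print(next_words(scoredT, "out to"))
```

The second function shows what the original filter attempt was missing: a lambda (or any function) that is called once per item, instead of a comparison that has already been evaluated to True or False.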