How can I search a dictionary for an NLTK stem?

I'm having trouble checking whether a stemmed word exists in a dictionary. This is for some sentiment analysis work that I am doing. All I get back is this error:

Traceback (most recent call last):
  File "sentiment.py", line 369, in <module>
    score += int(senti_word_dict.get(get_stem(word)))
TypeError: int() argument must be a string or a number, not 'NoneType'

Here is my method for getting the stem of a word through NLTK:

from nltk.stem.snowball import SnowballStemmer

def get_stem(word):
    st = SnowballStemmer("english")
    stemmed_word = st.stem(word)
    return '' if stemmed_word is None else stemmed_word

Here is the code that checks each word against the dictionary:

for comment in all_comments:
    score = 0
    tokens = tokenize(comment)
    for word in tokens:
      if word in senti_word_dict:
        score += int(senti_word_dict.get(get_stem(word)))
    print(str(score)+" "+comment)
    print('\n')

For now I am just computing the score. Is there a way to pass the stemmed word as a string and look up its score in the dictionary? If there is anything I am doing wrong or could do better, let me know! Thanks!

You check whether word is in senti_word_dict. Perhaps it is. But then you stem it (it becomes a different word!) and try to retrieve the stem from the dictionary with senti_word_dict.get. If the stem is not in the dictionary (and why should it be?), get() returns None, and int(None) raises the TypeError. Solution: stem the word first, and only then look it up.
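A minimal sketch of that fix, assuming the dictionary is keyed by stems (or that the stems happen to match its keys). The names score_tokens and toy_stem are hypothetical and not part of the original code. toy_stem is only a stand-in so the sketch runs without NLTK. With NLTK installed you would pass SnowballStemmer("english").stem instead.

```python
def score_tokens(tokens, senti_word_dict, stem):
    """Sum sentiment scores, looking each token up by its stem."""
    score = 0
    for word in tokens:
        stemmed = stem(word)               # stem first ...
        if stemmed in senti_word_dict:     # ... then test the same key you read
            score += int(senti_word_dict[stemmed])
    return score


def toy_stem(word):
    """Hypothetical stand-in stemmer: strips a few common suffixes."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


# Toy dictionary keyed by stems; the values are strings, as the int() call
# in the question suggests (an assumption about the real data).
senti_word_dict = {"enjoy": "2", "fail": "-3"}

print(score_tokens(["enjoyed", "movie", "fails"], senti_word_dict, toy_stem))
```

Because the membership test and the lookup use the same key, the lookup can never yield None. If the sentiment dictionary stores full words rather than stems, stemming its keys once up front would be needed for the lookups to match.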
