Using NLTK SentiWordNet with Python

Question

I am doing sentiment analysis on Twitter data using Python and NLTK. I need a dictionary that contains the positive and negative polarities of words. I have read a lot about SentiWordNet, but when I use it in my project it does not give efficient, fast results. I think I'm not using it correctly. Can anyone tell me the correct way to use it? Here are the steps I have done so far:

tokenization of tweets
POS tagging of tokens
passing each tagged token to SentiWordNet

I am using the nltk package for tokenization and tagging. See part of my code below:

import nltk
from nltk.corpus import sentiwordnet as swn

# Penn Treebank tag prefix -> WordNet POS code
TAG_TO_WN = {'NN': 'n', 'VB': 'v', 'JJ': 'a', 'RB': 'r'}

pscore = 0.0  # running positive score
nscore = 0.0  # running negative score

tokens = nltk.word_tokenize(row)  # tokenization; row is one line of the file in which the tweets are saved
tagged = nltk.pos_tag(tokens)     # POS tagging

for word, tag in tagged:
    wn_pos = TAG_TO_WN.get(tag[:2])
    if wn_pos is None:
        continue  # not a noun, verb, adjective or adverb
    # Under Python 3, senti_synsets() returns a lazy iterator, so len() on it fails.
    # Materialize it once instead of repeating the lookup for every score.
    synsets = list(swn.senti_synsets(word, wn_pos))
    if synsets:
        pscore += synsets[0].pos_score()  # positive score of the first sense
        nscore += synsets[0].neg_score()  # negative score of the first sense

At the end I will calculate how many tweets are positive and how many are negative. Where am I going wrong? How should I use it? And is there any other similar dictionary that is easy to use?
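
Tweets tend to repeat the same words, so each (word, POS) pair only needs to be resolved once. The following is a minimal sketch of caching first-sense scores, assuming the lookup is passed in as a function. `make_cached_scorer`, `FakeSense`, `FAKE_LEXICON` and `fake_lookup` are hypothetical names used for illustration and are not part of NLTK; the stub stands in for SentiWordNet so the sketch runs without the corpus.

```python
from functools import lru_cache


def make_cached_scorer(lookup):
    """Wrap a senti_synsets-style lookup so each (word, pos) is resolved once.

    lookup(word, pos) must return an iterable of objects exposing
    pos_score() and neg_score().
    """
    @lru_cache(maxsize=None)
    def score(word, pos):
        senses = list(lookup(word, pos))  # materialize: the result may be a lazy iterator
        if not senses:
            return None                   # word/POS pair not in the lexicon
        first = senses[0]                 # first listed sense only, as in the question
        return first.pos_score(), first.neg_score()
    return score


# Hypothetical stand-in for SentiWordNet entries (illustrative values only).
class FakeSense:
    def __init__(self, pos, neg):
        self._pos, self._neg = pos, neg

    def pos_score(self):
        return self._pos

    def neg_score(self):
        return self._neg


FAKE_LEXICON = {("good", "a"): [FakeSense(0.75, 0.0)]}
calls = []  # records every real lookup so the effect of the cache is visible


def fake_lookup(word, pos):
    calls.append((word, pos))
    return iter(FAKE_LEXICON.get((word, pos), []))


score = make_cached_scorer(fake_lookup)
print(score("good", "a"))  # (0.75, 0.0)
print(score("good", "a"))  # same result, served from the cache
print(len(calls))          # 1
```

With NLTK and the SentiWordNet corpus installed, the same wrapper could presumably be built as `make_cached_scorer(swn.senti_synsets)`.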

Answer 1

Yes, there are other lexicons that you can use. You can find a small list of lexicons here: http://sentiment.christopherpotts.net/lexicons.html#resources It seems Bing Liu's Opinion Lexicon is quite easy to use.

Apart from linking to those lexicons, that website is also a very nice tutorial on sentiment analysis.
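
For comparison, a word-list lexicon such as the Opinion Lexicon reduces scoring to set membership. This is a minimal sketch, assuming the lexicon has already been loaded into two sets. The tiny `POSITIVE` and `NEGATIVE` sets and the `lexicon_polarity` function are hypothetical placeholders, not the real lexicon or an existing API.

```python
# Illustrative placeholder word lists; the real lexicon is far larger.
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "awful", "hate", "sad"}


def lexicon_polarity(tokens, positive=POSITIVE, negative=NEGATIVE):
    """Return 'Pos', 'Neg' or 'Neu' from counts of lexicon hits in tokens."""
    words = [t.lower() for t in tokens]
    pos_hits = sum(w in positive for w in words)  # tokens found in the positive list
    neg_hits = sum(w in negative for w in words)  # tokens found in the negative list
    if pos_hits > neg_hits:
        return "Pos"
    if neg_hits > pos_hits:
        return "Neg"
    return "Neu"  # no hits, or a tie


print(lexicon_polarity("I love this great phone".split()))  # Pos
print(lexicon_polarity("What an awful , sad day".split()))  # Neg
```

NLTK distributes this lexicon as the `opinion_lexicon` corpus, so after downloading it the two sets could be built from `opinion_lexicon.positive()` and `opinion_lexicon.negative()` instead of the placeholders.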

Answer 2

Calculate the sentiment:

from nltk.corpus import sentiwordnet as swn

def calculate_sentiment(all_words_in_comment):
    # all_words_in_comment: (word, pos) pairs for all tokens in the document,
    # where pos is a WordNet POS code ('n', 'v', 'a' or 'r')
    total_score = 0
    count_words_included = 0

    for word in all_words_in_comment:
        synset_forms = list(swn.senti_synsets(word[0], word[1]))
        if not synset_forms:
            continue  # word is not in SentiWordNet
        synset = synset_forms[0]  # use the first listed sense only
        total_score += synset.pos_score() - synset.neg_score()
        count_words_included += 1

    if count_words_included == 0:
        return 'N/A'  # no token could be scored
    if total_score == 0:
        return 'Neu'
    if total_score / count_words_included < 0:
        return 'Neg'
    return 'Pos'
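
The loop above expects each item to be a (word, WordNet POS) pair, whereas `nltk.pos_tag` returns Penn Treebank tags. This is a minimal sketch of the conversion, assuming the tweet has already been tagged. `penn_to_wordnet` and `to_wordnet_pairs` are hypothetical helper names, and the tags in the example are written by hand so that no tagger model is needed.

```python
# Penn Treebank tag prefix -> WordNet POS code ('n', 'v', 'a', 'r').
PENN_PREFIX_TO_WN = {"NN": "n", "VB": "v", "JJ": "a", "RB": "r"}


def penn_to_wordnet(tag):
    """Return 'n', 'v', 'a' or 'r' for a Penn tag, or None if there is no match."""
    return PENN_PREFIX_TO_WN.get(tag[:2])


def to_wordnet_pairs(tagged):
    """Keep only tokens that map to a WordNet POS, as (word, pos) pairs."""
    pairs = []
    for word, tag in tagged:
        wn_pos = penn_to_wordnet(tag)
        if wn_pos is not None:
            pairs.append((word, wn_pos))
    return pairs


# Hand-written (word, tag) pairs in the shape that nltk.pos_tag returns.
tagged = [("The", "DT"), ("camera", "NN"), ("works", "VBZ"),
          ("really", "RB"), ("well", "RB")]
print(to_wordnet_pairs(tagged))
# [('camera', 'n'), ('works', 'v'), ('really', 'r'), ('well', 'r')]
```

Determiners, pronouns and other tags without a WordNet counterpart are dropped, which matches the noun/verb/adjective/adverb branches in the question.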

Answer 1
4 2015-12-23 11:28:45

Answer 2
0 2018-07-31 10:31:58
