NLTK sentiment vader: ordering results
I just ran VADER sentiment analysis on my dataset:
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk import tokenize
sid = SentimentIntensityAnalyzer()
for sentence in filtered_lines2:
    print(sentence)
    ss = sid.polarity_scores(sentence)
    for k in sorted(ss):
        print('{0}: {1}, '.format(k, ss[k]), )
        print()
Here is a sample of my results:
Are these guests on Samsung and Google event mostly Chinese Wow Theyre
boring
Google Samsung
('compound: 0.3612, ',)
()
('neg: 0.12, ',)
()
('neu: 0.681, ',)
()
('pos: 0.199, ',)
()
Adobe lose 135bn to piracy Report
('compound: -0.4019, ',)
()
('neg: 0.31, ',)
()
('neu: 0.69, ',)
()
('pos: 0.0, ',)
()
Samsung Galaxy Nexus announced
('compound: 0.0, ',)
()
('neg: 0.0, ',)
()
('neu: 1.0, ',)
()
('pos: 0.0, ',)
()
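I want to know how many times "compound" is equal to, greater than, or less than zero.
I know this is probably very easy, but I am really new to Python and to coding in general. I have tried many different ways to build what I need, but I can't find a solution.
(If the "sample of results" is formatted incorrectly, please edit my question, since I don't know the right way to write it.)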
You can use a simple counter for each class:
positive, negative, neutral = 0, 0, 0
Then, inside the sentence loop, test the compound value and increment the corresponding counter:
    ...
    if ss['compound'] > 0:
        positive += 1
    elif ss['compound'] == 0:
        neutral += 1
    elif ...
and so on.
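The three-counter idea above can be sketched as a complete function. This is a minimal sketch: the helper name count_by_sign is hypothetical, the final branch (counting everything else as negative) is an assumption about the elided part of the answer, and it works on plain compound scores so it runs without NLTK installed. The scores are the three compound values from the sample output in the question.

```python
def count_by_sign(compound_scores):
    """Count how many compound scores are above, equal to, or below zero."""
    positive, negative, neutral = 0, 0, 0
    for compound in compound_scores:
        if compound > 0:
            positive += 1
        elif compound == 0:
            neutral += 1
        else:
            # Assumed final branch: anything that is neither > 0 nor == 0.
            negative += 1
    return positive, neutral, negative


# Compound scores copied from the sample output in the question.
sample_scores = [0.3612, -0.4019, 0.0]
positive, neutral, negative = count_by_sign(sample_scores)
print("positive:", positive, "neutral:", neutral, "negative:", negative)
# prints: positive: 1 neutral: 1 negative: 1
```

In the real loop, each score would come from sid.polarity_scores(sentence)['compound'] rather than from a hard-coded list.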
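This is far from the most Pythonic way to do it, but if you don't have much experience with Python, I think it will be the easiest to understand. Essentially, you create a dictionary whose values start at 0 and increment the matching value in each case.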
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk import tokenize
sid = SentimentIntensityAnalyzer()
res = {"greater": 0, "less": 0, "equal": 0}
for sentence in filtered_lines2:
    ss = sid.polarity_scores(sentence)
    if ss["compound"] == 0.0:
        res["equal"] += 1
    elif ss["compound"] > 0.0:
        res["greater"] += 1
    else:
        res["less"] += 1
print(res)
I would probably define a function that returns the type of inequality a document's score represents:
def inequality_type(val):
    if val == 0.0:
        return "equal"
    elif val > 0.0:
        return "greater"
    return "less"
Then apply it to the compound score of every sentence, incrementing the count for the corresponding inequality type.
from collections import defaultdict
def count_sentiments(sentences):
    # Create a dictionary with values defaulted to 0
    counts = defaultdict(int)
    # Create a polarity score for each sentence
    for score in map(sid.polarity_scores, sentences):
        # Increment the dictionary entry for that inequality type
        counts[inequality_type(score["compound"])] += 1
    return counts
You can then call it on your filtered lines.
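As a rough illustration of that call, the sketch below runs count_sentiments end to end. StubAnalyzer is a hypothetical stand-in (it is not an NLTK class) that returns the compound scores shown in the question's sample output, so the example runs without the VADER lexicon; the first sentence is the question's wrapped sample line joined with spaces.

```python
from collections import defaultdict


class StubAnalyzer:
    """Hypothetical stand-in for SentimentIntensityAnalyzer (not an NLTK class)."""

    def __init__(self, compound_by_sentence):
        self._compound_by_sentence = compound_by_sentence

    def polarity_scores(self, sentence):
        # Only the "compound" key is needed by count_sentiments.
        return {"compound": self._compound_by_sentence[sentence]}


# Sentences and compound scores copied from the sample output in the question.
filtered_lines2 = [
    "Are these guests on Samsung and Google event mostly Chinese Wow Theyre boring Google Samsung",
    "Adobe lose 135bn to piracy Report",
    "Samsung Galaxy Nexus announced",
]
sid = StubAnalyzer(dict(zip(filtered_lines2, [0.3612, -0.4019, 0.0])))


def inequality_type(val):
    if val == 0.0:
        return "equal"
    elif val > 0.0:
        return "greater"
    return "less"


def count_sentiments(sentences):
    # Missing keys start at 0, so no key needs to be initialised up front.
    counts = defaultdict(int)
    for score in map(sid.polarity_scores, sentences):
        counts[inequality_type(score["compound"])] += 1
    return counts


counts = count_sentiments(filtered_lines2)
print(dict(counts))  # {'greater': 1, 'less': 1, 'equal': 1}
```

With the real analyzer, the stub would be replaced by sid = SentimentIntensityAnalyzer(), which needs the vader_lexicon resource (obtainable with nltk.download('vader_lexicon')).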
However, the manual dictionary bookkeeping can be avoided by simply using collections.Counter:
from collections import Counter
def count_sentiments(sentences):
    # Count the inequality type for each score in the sentences' polarity scores
    return Counter(
        inequality_type(score["compound"])
        for score in map(sid.polarity_scores, sentences)
    )
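Since the question's title asks about ordering results, it may help to note that a Counter can also report its tallies sorted by frequency. The sketch below is only illustrative: it feeds the three compound scores from the question's sample output straight into a Counter (no analyzer involved), and the label "missing" is a hypothetical key used to show the default lookup.

```python
from collections import Counter


def inequality_type(val):
    if val == 0.0:
        return "equal"
    elif val > 0.0:
        return "greater"
    return "less"


# Compound scores copied from the sample output in the question.
sample_compounds = [0.3612, -0.4019, 0.0]
counts = Counter(inequality_type(c) for c in sample_compounds)

# most_common() yields (label, count) pairs from highest to lowest count.
for label, count in counts.most_common():
    print(label, count)

# A Counter returns 0 for a label it never saw instead of raising KeyError.
print(counts["missing"])
```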