Finding whether the opinion of a sentence is positive or negative

I need to find the opinion expressed in certain reviews posted on websites. I am using SentiWordNet for this. I first send the file containing all the reviews to a POS tagger.

import nltk
tokens = nltk.word_tokenize(line)  # tokenize each line in the file
tagged = nltk.pos_tag(tokens)      # POS-tag the tokens
print(tagged)

Is there a more accurate way of tokenizing that treats "not good" as one word rather than as two separate words?

Now I have to give positive and negative scores to the tokenized words and then calculate the total score. Is there a function in SentiWordNet for this? Please help.

First extract the adverbs and adjectives from the review, for example:

import csv

import nltk
from nltk.tokenize import word_tokenize

para = "What can I say about this place. The staff of the restaurant is nice and the eggplant is not bad. Apart from that, very uninspired food, lack of atmosphere and too expensive. I am a staunch vegetarian and was sorely dissapointed with the veggie options on the menu. Will be the last time I visit, I recommend others to avoid"

# Tokenize the review and keep only adjectives (JJ*) and adverbs (RB*).
tokens = word_tokenize(para)
word_features = []
for word, tag in nltk.pos_tag(tokens):
    if tag in ['JJ', 'JJR', 'JJS', 'RB', 'RBR', 'RBS']:
        word_features.append(word)

# Load the word,label lexicon once instead of reopening the file for every word.
lexicon = {}
with open('words.txt', 'rt') as f:
    for row in csv.reader(f, delimiter=','):
        if len(row) >= 2:
            lexicon[row[0]] = row[1]

# Add 1 for each positive word and subtract 1 for each negative word.
rating = 0
for word in word_features:
    label = lexicon.get(word)
    if label is not None:
        print(word, label)
        if label == 'pos':
            rating = rating + 1
        elif label == 'neg':
            rating = rating - 1
print(rating)

You also need an external CSV file (words.txt in the script above) that lists positive and negative words, like:

wrinkle,neg
wrinkled,neg
wrinkles,neg
masterfully,pos
masterpiece,pos
masterpieces,pos
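The word,label format can be loaded into a dictionary for fast lookups. This is only a minimal sketch: the `load_lexicon` helper is a hypothetical name, and an in-memory `io.StringIO` object stands in for the real words.txt so the example runs on its own.

```python
import csv
import io


def load_lexicon(csv_file):
    """Read word,label rows into a dict mapping word -> 'pos' or 'neg'."""
    lexicon = {}
    for row in csv.reader(csv_file, delimiter=','):
        # Skip blank or malformed rows rather than failing on them.
        if len(row) >= 2:
            lexicon[row[0].strip()] = row[1].strip()
    return lexicon


# In-memory stand-in for words.txt, using the sample rows shown above.
sample = io.StringIO(
    "wrinkle,neg\nwrinkled,neg\nwrinkles,neg\n"
    "masterfully,pos\nmasterpiece,pos\nmasterpieces,pos\n"
)
lexicon = load_lexicon(sample)
print(lexicon["masterpiece"])  # prints: pos
```

With a real file, the same helper could be called as `load_lexicon(open('words.txt', 'rt'))`, ideally inside a `with` block.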

The script above works as follows:

1. Read the sentence.
2. Extract the adverbs and adjectives.
3. Compare them against the CSV file of positive and negative words.
4. Rate the sentence.

The result of the script above is:

nice pos  
bad neg  
expensive neg  
sorely neg  
-2

Change the result as per your needs. And sorry for my English :P
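One possible adjustment: in the output above, "bad" counts as negative even though the review says "not bad". A rough sketch of handling this is to flip the score of a word that directly follows a negator. The `NEGATORS` set, the `rate` function, and the hand-written tags below are assumptions made for illustration; the tags are in the (word, tag) format that `nltk.pos_tag` returns, so the example runs without NLTK.

```python
NEGATORS = {"not", "no", "never", "n't"}  # assumed list, extend as needed
SENTIMENT_TAGS = {"JJ", "JJR", "JJS", "RB", "RBR", "RBS"}
SCORES = {"pos": 1, "neg": -1}


def rate(tagged, lexicon):
    """Score (word, tag) pairs; a negator directly before a word flips its sign."""
    rating = 0
    previous = ""
    for word, tag in tagged:
        word = word.lower()
        if tag in SENTIMENT_TAGS and word in lexicon:
            score = SCORES.get(lexicon[word], 0)
            if previous in NEGATORS:
                score = -score  # "not bad" counts as positive
            rating += score
        previous = word
    return rating


lexicon = {"nice": "pos", "bad": "neg"}
# Hand-tagged fragment of the review, written out for this example.
tagged = [("staff", "NN"), ("is", "VBZ"), ("nice", "JJ"), ("and", "CC"),
          ("the", "DT"), ("eggplant", "NN"), ("is", "VBZ"),
          ("not", "RB"), ("bad", "JJ")]
print(rate(tagged, lexicon))  # prints: 2
```

This only looks one word back, so a phrase like "not very good" would still be scored as positive; a wider window is a possible refinement.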
