
Extract positive and negative words from text?

I need to find the opinion expressed in certain reviews posted on websites. I am using SentiWordNet for this. I first send the file containing all the reviews to a POS tagger.

Is there a more accurate way of tokenizing that treats "not good" as one token instead of two separate words?
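One possible approach to the "not good" case is to post-process the token list and join each negator with the word that follows it. The sketch below is only an illustration: `merge_negations` and the `NEGATORS` set are hypothetical names, not part of NLTK or SentiWordNet, and the list of negators is an assumption that would need tuning for real reviews.

```python
# Hypothetical helper: join a negator with the next word so that
# ["not", "bad"] becomes the single token "not_bad".
NEGATORS = {"not", "no", "never", "n't"}  # assumed list, extend as needed


def merge_negations(tokens, negators=NEGATORS):
    """Return a new token list with negator + following word merged."""
    merged = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        has_next_word = i + 1 < len(tokens) and tokens[i + 1].isalpha()
        if tok.lower() in negators and has_next_word:
            # Consume two tokens and emit one combined token.
            merged.append(tok.lower() + "_" + tokens[i + 1])
            i += 2
        else:
            merged.append(tok)
            i += 1
    return merged
```

The function takes the list returned by `word_tokenize`, so it can be called just before POS tagging. For a fixed list of phrases, NLTK's `nltk.tokenize.MWETokenizer` is an alternative that merges known multi-word expressions.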

Next I have to assign positive and negative scores to the tokenized words and then calculate the total score. Is there a function in SentiWordNet for this? Please help.
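On the scoring question: NLTK's SentiWordNet reader exposes per-sense scores (`pos_score()`, `neg_score()`, `obj_score()`) through `senti_synsets`, so a total has to be assembled by hand. The sketch below is one way this might look, assuming the `sentiwordnet` and `wordnet` corpora are installed. The helper names `penn_to_wordnet`, `swn_polarity` and `total_score` are hypothetical, and averaging over all senses of a word is a simplification chosen for illustration, not a recommended method.

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank tag (as returned by nltk.pos_tag) to a WordNet POS letter."""
    if tag.startswith("JJ"):
        return "a"
    if tag.startswith("RB"):
        return "r"
    if tag.startswith("NN"):
        return "n"
    if tag.startswith("VB"):
        return "v"
    return None


def swn_polarity(word, tag):
    """Mean (positive - negative) score over the word's SentiWordNet senses.

    Assumption: averaging over senses, with no word-sense disambiguation.
    """
    # Imported here so the rest of the sketch works without the corpora.
    from nltk.corpus import sentiwordnet as swn

    pos = penn_to_wordnet(tag)
    if pos is None:
        return 0.0
    senses = list(swn.senti_synsets(word, pos))
    if not senses:
        return 0.0
    return sum(s.pos_score() - s.neg_score() for s in senses) / len(senses)


def total_score(tagged_words, polarity=swn_polarity):
    """Sum the polarity of every (word, tag) pair."""
    return sum(polarity(word, tag) for word, tag in tagged_words)
```

`total_score(nltk.pos_tag(tokens))` would then give a single number for a review. The `polarity` parameter is there so that a different lexicon, such as the `words.txt` file used in the question, can be plugged in instead.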

import nltk
from nltk.tokenize import sent_tokenize, word_tokenize
import csv

para = "What can I say about this place. The staff of the restaurant is nice and the eggplant is not bad. Apart from that, very uninspired food, lack of atmosphere and too expensive. I am a staunch vegetarian and was sorely disappointed with the veggie options on the menu. Will be the last time I visit, I recommend others to avoid"

sentense = word_tokenize(para)
word_features = []

for i,j in nltk.pos_tag(sentense):
    if j in ['JJ', 'JJR', 'JJS', 'RB', 'RBR', 'RBS']:
        word_features.append(i)

rating = 0

for i in word_features:
    with open('words.txt', 'rt') as f:
        reader = csv.reader(f, delimiter=',')
        for row in reader:
            if i == row[0]:
                print i, row[1]
                if row[1] == 'pos':
                    rating = rating + 1
                elif row[1] == 'neg':
                    rating = rating - 1
print  rating

Error:

Traceback (most recent call last):
  File "E:/Emotional from text/pORnOfWord.py", line 10, in <module>
    for i,j in nltk.pos_tag(sentense):
  File "C:\Python27\lib\site-packages\nltk\tag\__init__.py", line 99, in pos_tag
    tagger = load(_POS_TAGGER)
  File "C:\Python27\lib\site-packages\nltk\data.py", line 605, in load
    resource_val = pickle.load(_open(resource_url))
  File "C:\Python27\lib\site-packages\nltk\data.py", line 686, in _open
    return find(path).open()
  File "C:\Python27\lib\site-packages\nltk\data.py", line 467, in find
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
  Resource 'taggers/maxent_treebank_pos_tagger/english.pickle' not
  found.  Please use the NLTK Downloader to obtain the resource:
  >>> nltk.download()
  Searched in:
    - 'C:\\Users\\Eman\x99/nltk_data'
    - 'C:\\nltk_data'
    - 'D:\\nltk_data'
    - 'E:\\nltk_data'
    - 'C:\\Python27\\nltk_data'
    - 'C:\\Python27\\lib\\nltk_data'
    - 'C:\\Users\\Eman\x99\\AppData\\Roaming\\nltk_data'

The error occurs because a required NLTK data package is missing: the traceback reports that 'taggers/maxent_treebank_pos_tagger/english.pickle' was not found. Downloading the package resolves it. Run the code below to open the NLTK Downloader and install it:

import nltk 
nltk.download()
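If the interactive downloader is inconvenient, a package can also be fetched by name with `nltk.download(package_id)`. The sketch below checks for the resource first and downloads only when it is missing. It assumes the package identifier is `maxent_treebank_pos_tagger`, inferred from the resource path in the traceback; newer NLTK releases use a different default tagger, so the identifier may differ there. `ensure_nltk_resource` is a hypothetical helper, not part of NLTK.

```python
def ensure_nltk_resource(resource_path, package_id, find=None, download=None):
    """Download package_id only if resource_path is not already installed.

    find and download default to nltk.data.find and nltk.download; they are
    parameters only so the logic can be tried without NLTK or a network.
    Returns True if a download was triggered, False otherwise.
    """
    if find is None or download is None:
        import nltk
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(resource_path)  # raises LookupError when the resource is absent
        return False
    except LookupError:
        download(package_id)
        return True


# Example call (path copied from the traceback, package id assumed):
# ensure_nltk_resource(
#     "taggers/maxent_treebank_pos_tagger/english.pickle",
#     "maxent_treebank_pos_tagger",
# )
```

Calling this once before `nltk.pos_tag` avoids the `LookupError` on a fresh machine without opening the downloader window.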
