Dictionary for sentiment analysis in NLTK

I am new to Python and NLTK. I have built a sentiment analysis model for survey responses in NLTK (NaiveBayesClassifier). To improve its accuracy, I want to add a dictionary containing lists of positive and negative statements to the model. Is there a module in NLTK for this, and are there any additional features that could improve my model?

You can have a look at some public sentiment lexicons, which provide lists of positive and negative words.

One of them can be found at https://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html
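As a rough sketch of how such a lexicon could feed an NLTK classifier, the function below adds lexicon hit counts to ordinary word-presence features. The function name `lexicon_features`, the feature names, and the tiny word sets are made up for illustration. NLTK also ships an `opinion_lexicon` corpus (the Hu and Liu word lists), which should be usable in place of the stand-in sets once its data has been downloaded.

```python
# Tiny stand-in lexicon so the sketch runs offline (hypothetical word lists).
# With the NLTK data installed, the Hu and Liu lists could be loaded instead:
#   nltk.download("opinion_lexicon")
#   from nltk.corpus import opinion_lexicon
#   POSITIVE = set(opinion_lexicon.positive())
#   NEGATIVE = set(opinion_lexicon.negative())
POSITIVE = {"good", "great", "helpful", "excellent"}
NEGATIVE = {"bad", "poor", "slow", "terrible"}


def lexicon_features(tokens):
    """Return a feature dict of the kind nltk.NaiveBayesClassifier.train accepts."""
    words = [t.lower() for t in tokens]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    # Ordinary word-presence features.
    features = {f"contains({w})": True for w in set(words)}
    # Lexicon-derived features, capped at 3 because NLTK's Naive Bayes
    # treats every feature value as a discrete category.
    features["pos_hits"] = min(pos, 3)
    features["neg_hits"] = min(neg, 3)
    features["lexicon_polarity"] = (
        "pos" if pos > neg else "neg" if neg > pos else "neutral"
    )
    return features


example = lexicon_features("The staff were helpful but the service was slow".split())
```

Each training example would then be a `(lexicon_features(tokens), label)` pair passed to `nltk.NaiveBayesClassifier.train`.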

Since you haven't specified any details about your current model, I'm assuming you are using a very basic Naive Bayes classifier. If you are currently using unigrams (single words) to vectorize your text, consider using bigrams or trigrams to generate the feature vectors. This would let you use the contextual information of the words to a certain extent. For example, the bigram "not good" keeps a negation that the separate unigrams "not" and "good" lose.
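A minimal sketch of n-gram features, assuming the model takes NLTK-style feature dicts. The helper name `ngram_features` and the feature naming scheme are hypothetical; `nltk.ngrams` could generate the same windows.

```python
def ngram_features(tokens, max_n=2):
    """Presence features for all 1-grams up to max_n-grams, as a dict."""
    words = [t.lower() for t in tokens]
    features = {}
    for n in range(1, max_n + 1):
        # Slide a window of length n across the token list.
        for i in range(len(words) - n + 1):
            gram = " ".join(words[i:i + n])
            features[f"{n}gram({gram})"] = True
    return features


feats = ngram_features("this was not good".split())
# feats now holds both "1gram(good)" and "2gram(not good)".
```

Raising `max_n` to 3 adds trigrams, at the cost of many more sparse features, so it is worth checking accuracy on held-out data before keeping them.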

If you are currently using a bag-of-words model such as TF-IDF to convert your text to vectors, you can consider using word embeddings instead. A bag-of-words model doesn't take the context of words into account, whereas word embeddings are able to capitalise on it.

You could use something like gensim, whose Word2Vec implementation trains a shallow neural network to convert words to vectors. Have a look at: https://radimrehurek.com/gensim/models/word2vec.html
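One common way to turn word vectors into a document vector is to average them. The sketch below shows that step in plain Python. `train_word2vec` assumes gensim 4.x is installed, and its parameter values are placeholders rather than recommendations. Both function names and the toy vectors are invented for illustration.

```python
def document_vector(tokens, vectors, size):
    """Average the vectors of the tokens that `vectors` knows about."""
    total = [0.0] * size
    known = 0
    for token in tokens:
        if token in vectors:
            total = [a + float(b) for a, b in zip(total, vectors[token])]
            known += 1
    # A document with no known words maps to the zero vector.
    return [x / known for x in total] if known else total


def train_word2vec(tokenized_docs, size=50):
    """Assumes gensim 4.x; not called here so the sketch runs without it."""
    from gensim.models import Word2Vec
    model = Word2Vec(sentences=tokenized_docs, vector_size=size,
                     window=5, min_count=1, workers=1)
    return model.wv  # KeyedVectors: supports `word in wv` and `wv[word]`


# Hand-written 2-d vectors (hypothetical) stand in for a trained model.
toy_vectors = {"good": [1.0, 0.0], "bad": [-1.0, 0.0], "service": [0.0, 1.0]}
doc = document_vector(["good", "service", "overall"], toy_vectors, size=2)
```

Dense vectors like these suit classifiers such as LinearSVC or logistic regression better than NLTK's Naive Bayes, which expects discrete feature values.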

Furthermore, you can always try a LinearSVC classifier or a logistic regression classifier and choose whichever gives the best accuracy on held-out data.
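A sketch of such a comparison, assuming scikit-learn is available and the survey responses are plain strings. The toy texts and labels below are invented, so the scores mean nothing beyond showing the mechanics.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy survey responses (made up for illustration); 1 = positive, 0 = negative.
texts = [
    "great service and friendly staff", "really helpful and quick",
    "excellent experience overall", "good value and great support",
    "very happy with the product", "helpful staff and good service",
    "terrible service and rude staff", "slow and unhelpful support",
    "bad experience overall", "poor value and slow delivery",
    "very unhappy with the product", "rude staff and bad service",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

candidates = {
    "naive_bayes": MultinomialNB(),
    "linear_svc": LinearSVC(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

scores = {}
for name, clf in candidates.items():
    # Same unigram + bigram TF-IDF features for every candidate.
    pipeline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), clf)
    scores[name] = cross_val_score(
        pipeline, texts, labels, cv=3, scoring="accuracy"
    ).mean()

best = max(scores, key=scores.get)
```

Cross-validation is used here instead of a single train/test split because small survey datasets can make one split misleading.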
