AttributeError: 'list' object has no attribute 'analyze'

Question

I was trying to calculate tf-idf and here is my code:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from nltk.corpus import stopwords
import numpy as np
import numpy.linalg as LA

train_set = ["The sky is blue.", "The sun is bright."] #Documents
test_set = ["The sun in the sky is bright sun."] #Query
stopWords = stopwords.words('english')

vectorizer = CountVectorizer(stopWords)
#print vectorizer
transformer = TfidfTransformer() 
#print transformer

trainVectorizerArray = vectorizer.fit_transform(train_set).toarray()
testVectorizerArray = vectorizer.transform(test_set).toarray()
print 'Fit Vectorizer to train set', trainVectorizerArray
print 'Transform Vectorizer to test set', testVectorizerArray

transformer.fit(trainVectorizerArray)
print
print transformer.transform(trainVectorizerArray).toarray()

transformer.fit(testVectorizerArray)
print
tfidf = transformer.transform(testVectorizerArray)
print tfidf.todense()

I am getting this error:

Traceback (most recent call last):
  File "tf-idf2.py", line 16, in <module>
    trainVectorizerArray = vectorizer.fit_transform(train_set).toarray()
  File "/usr/lib/pymodules/python2.7/sklearn/feature_extraction/text.py", line 341, in fit_transform
    term_count_current = Counter(self.analyzer.analyze(doc))
AttributeError: 'list' object has no attribute 'analyze'

I am using scikit-learn version 0.14.1.

Answer 1

CountVectorizer(stopWords)

should be

CountVectorizer(stop_words=stopWords)

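Always use keyword arguments for the constructor parameters of scikit-learn objects, unless indicated otherwise in the docs. Passed positionally, the list is bound to the constructor's first parameter, which is not stop_words. The traceback shows it ended up as the analyzer, which is why .analyze was called on a list.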
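
For reference, here is a minimal sketch of the question's pipeline with that fix applied, written for Python 3 and a recent scikit-learn. Two things in it are assumptions rather than part of the original code: the short hand-written stop_words list stands in for nltk's stopwords.words('english') so that it runs without the NLTK data, and the transformer is fitted once on the training counts and then applied to the query.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

train_set = ["The sky is blue.", "The sun is bright."]  # documents
test_set = ["The sun in the sky is bright sun."]        # query

# Assumption: a tiny hand-written stop list replaces
# nltk.corpus.stopwords.words('english') to keep the sketch self-contained.
stop_words = ["the", "is", "in"]

# Pass the list by keyword so it is bound to stop_words
# and not to whatever parameter happens to come first.
vectorizer = CountVectorizer(stop_words=stop_words)
transformer = TfidfTransformer()

# Learn the vocabulary from the documents, then count the same terms in the query.
train_counts = vectorizer.fit_transform(train_set)
test_counts = vectorizer.transform(test_set)

print("Vocabulary:", sorted(vectorizer.vocabulary_))
print("Fit Vectorizer to train set", train_counts.toarray())
print("Transform Vectorizer to test set", test_counts.toarray())

# Assumption: idf weights are learned from the training counts only
# and then applied to the query counts.
transformer.fit(train_counts)
tfidf = transformer.transform(test_counts)
print(tfidf.toarray())
```

The count matrices have one column per vocabulary term, in alphabetical order of the terms, so the query row can be compared against the document rows directly.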
