Supervised machine learning with scikit-learn

Question

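This is my first time doing supervised machine learning. It is a fairly advanced topic (at least for me), and I am not sure what to ask, because I cannot tell what is going wrong.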

# Create a training list and test list (looks something like this):
train = [('this hostel was nice',2),('i hate this hostel',1)]
test = [('had a wonderful time',2),('terrible experience',1)]

# Loading modules
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer
from sklearn import metrics

# Use a BOW representation of the reviews
vectorizer = CountVectorizer(stop_words='english') 
train_features = vectorizer.fit_transform([r[0] for r in train]) 
test_features = vectorizer.fit([r[0] for r in test])

# Fit a naive bayes model to the training data
nb = MultinomialNB()
nb.fit(train_features, [r[1] for r in train])

# Use the classifier to predict classification of test dataset
predictions = nb.predict(test_features)
actual=[r[1] for r in test]

At this point I get the error:

float() argument must be a string or a number, not 'CountVectorizer'

This confuses me, because the original ratings I extracted from the reviews are integers:

type(ratings_new[0])
int

Answer 1

You should change the line

test_features = vectorizer.fit([r[0] for r in test])

to:

test_features = vectorizer.transform([r[0] for r in test])

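The reason is that you have already fitted the vectorizer on the training data, so you should not fit it again on the test data. Instead, you only need to transform the test data with the fitted vectorizer. Note that fit() returns the vectorizer object itself rather than a feature matrix, so test_features ended up being a CountVectorizer, which is what the float() error from nb.predict() is complaining about.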
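
For illustration, here is a minimal sketch of the corrected flow, reusing the toy data from the question. The names `returned`, `fit_returns_vectorizer` and `accuracy` are not in the original post and are introduced only for this example; the accuracy step is an assumption about how the unused `metrics` import and `actual` list were meant to be used.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn import metrics

# Toy data from the question: (review text, rating)
train = [('this hostel was nice', 2), ('i hate this hostel', 1)]
test = [('had a wonderful time', 2), ('terrible experience', 1)]

# Learn the vocabulary on the training reviews and vectorize them
vectorizer = CountVectorizer(stop_words='english')
train_features = vectorizer.fit_transform([text for text, _ in train])

# fit() alone returns the vectorizer object, not a matrix
# (illustrative check on a separate instance, not part of the fix)
returned = CountVectorizer(stop_words='english').fit([text for text, _ in test])
fit_returns_vectorizer = isinstance(returned, CountVectorizer)

# Reuse the training vocabulary for the test reviews: transform only
test_features = vectorizer.transform([text for text, _ in test])

# Fit a naive Bayes model on the training data and predict the test labels
nb = MultinomialNB()
nb.fit(train_features, [label for _, label in train])
predictions = nb.predict(test_features)

# Compare against the true test labels (assumed intent of the metrics import)
actual = [label for _, label in test]
accuracy = metrics.accuracy_score(actual, predictions)
```

With this toy data none of the test words appear in the training vocabulary, so the test matrix is all zeros and the predictions carry no real signal; a larger training set is needed before the accuracy means anything.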

Solution 1
1 accepted 2017-03-28 21:27:58
