Python NLTK Classifier.train(trainfeats)… ValueError: need more than 1 value to unpack
def word_feats(words):
    return dict([(word, True) for word in words])

for tweet in negTweets:
    words = re.findall(r"[\w']+|[.,!?;]", tweet) #splits the tweet into words
    negwords = [(word_feats(words), 'neg')] #tag the words with feature
    negfeats.append(negwords) #add the words to the feature list
for tweet in posTweets:
    words = re.findall(r"[\w']+|[.,!?;]", tweet)
    poswords = [(word_feats(words), 'pos')]
    posfeats.append(poswords)

negcutoff = len(negfeats)*3/4 #take 3/4ths of the words
poscutoff = len(posfeats)*3/4

trainfeats = negfeats[:negcutoff] + posfeats[:poscutoff] #assemble the train set
testfeats = negfeats[negcutoff:] + posfeats[poscutoff:]

classifier = NaiveBayesClassifier.train(trainfeats)
print 'accuracy:', nltk.classify.util.accuracy(classifier, testfeats)
classifier.show_most_informative_features()
When I run this code I get the following error...
  File "C:\Python27\lib\nltk\classify\naivebayes.py", line 191, in train
    for featureset, label in labeled_featuresets:
ValueError: need more than 1 value to unpack
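The error comes from the classifier = NaiveBayesClassifier.train(trainfeats) line, and I'm not sure why. I have done something like this before, and my trainfeats seems to be in the same format as it was then... an example of the format is listed below...
[[({'me': True, 'af': True, 'this': True, 'joy': True, 'high': True, 'hookah': True, 'got': True}, 'pos')]]
What other value does my trainfeats need in order to create the classifier?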
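@Prune's comment is right: your labeled_featuresets should be a sequence of pairs (two-element lists or tuples), one per data point, each holding the feature dict and the category. Instead, each element of your trainfeats is a list containing a single element: the tuple of those two things. When train() iterates over trainfeats and tries to unpack each element into featureset and label, it finds only one value, hence the ValueError. Lose the square brackets in both feature-building loops and this part should work correctly. For example: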
negwords = (word_feats(words), 'neg')
negfeats.append(negwords)
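To make the difference in shape concrete, here is a minimal sketch that builds the feature lists without the extra brackets and then replays the unpacking loop from the traceback. It is written in Python 3 syntax, needs no NLTK install, and the helper build_feats and the sample tweets are hypothetical, introduced only for illustration.

```python
import re

def word_feats(words):
    # Same feature extractor as in the question: every token maps to True.
    return dict((word, True) for word in words)

def build_feats(tweets, label):
    # Hypothetical helper: one (featureset, label) pair per tweet,
    # with no extra list wrapped around the pair.
    feats = []
    for tweet in tweets:
        words = re.findall(r"[\w']+|[.,!?;]", tweet)
        feats.append((word_feats(words), label))
    return feats

# Hypothetical sample tweets, used only for illustration.
neg_feats = build_feats(["this is awful", "so sad today"], "neg")
pos_feats = build_feats(["got this joy", "what a high"], "pos")

# The same kind of loop that train() runs now unpacks cleanly.
for featureset, label in neg_feats + pos_feats:
    assert isinstance(featureset, dict)

# Wrapping each pair in a list recreates the shape from the question,
# and the unpacking fails with a ValueError.
wrapped = [[pair] for pair in neg_feats]
try:
    for featureset, label in wrapped:
        pass
    unpack_error = None
except ValueError as exc:
    unpack_error = exc
```

In Python 3 the wording of the message differs from the Python 2.7 traceback above, but the exception type and the cause are the same.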
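Two more things: consider using nltk.word_tokenize() instead of doing your own tokenization. And you should randomize the order of your training data, e.g. with random.shuffle(trainfeats), which shuffles the list in place.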
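The standard-library call for randomizing a list is random.shuffle, which reorders the list in place and returns None. The sketch below is one possible way to combine it with the 3/4 train/test split from the question, in Python 3 syntax. Note that it shuffles before splitting rather than shuffling only the training list, which is a swapped-in variation; the function name shuffle_and_split, its parameters, and the toy data are assumptions made for illustration.

```python
import random

def shuffle_and_split(labeled_feats, train_fraction=0.75, seed=None):
    # Hypothetical helper. Work on a copy so the caller's list is untouched;
    # shuffle() itself reorders in place and returns None.
    shuffled = list(labeled_feats)
    random.Random(seed).shuffle(shuffled)
    # int() keeps the cutoff usable as a slice index in Python 3,
    # where / always produces a float.
    cutoff = int(len(shuffled) * train_fraction)
    return shuffled[:cutoff], shuffled[cutoff:]

# Hypothetical toy data in the (featureset, label) shape that train() expects.
data = [({"w%d" % i: True}, "pos" if i % 2 else "neg") for i in range(8)]
train, test = shuffle_and_split(data, seed=0)
```

Passing a fixed seed makes the split reproducible between runs; leaving it as None gives a different order each time. If you do switch to nltk.word_tokenize(), be aware that it relies on tokenizer data that has to be downloaded separately through nltk.download().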