Python NLTK Classifier.train(trainfeats)... ValueError: need more than 1 value to unpack

Question

def word_feats(words):
     return dict([(word, True) for word in words])

for tweet in negTweets:
     words = re.findall(r"[\w']+|[.,!?;]", tweet) #splits the tweet into words
     negwords = [(word_feats(words), 'neg')] #tag the words with feature
     negfeats.append(negwords) #add the words to the feature list
for tweet in posTweets:
     words = re.findall(r"[\w']+|[.,!?;]", tweet)
     poswords = [(word_feats(words), 'pos')]
     posfeats.append(poswords)

negcutoff = len(negfeats)*3/4 #take 3/4ths of the words
poscutoff = len(posfeats)*3/4

trainfeats = negfeats[:negcutoff] + posfeats[:poscutoff] #assemble the train set
testfeats = negfeats[negcutoff:] + posfeats[poscutoff:]

classifier = NaiveBayesClassifier.train(trainfeats)
print 'accuracy:', nltk.classify.util.accuracy(classifier, testfeats)
classifier.show_most_informative_features()

Running this code produces the following error...

File "C:\Python27\lib\nltk\classify\naivebayes.py", line 191, in train

for featureset, label in labeled_featuresets:

ValueError: need more than 1 value to unpack

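The error comes from the line classifier = NaiveBayesClassifier.train(trainfeats), and I'm not sure why. I have done something similar before, and my trainfeats seems to be in the same format as it was then... An example of the format is shown below...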

[[({'me': True, 'af': True, 'this': True, 'joy': True, 'high': True, 'hookah': True, 'got': True}, 'pos')]]

What other values does my trainfeats need in order to create the classifier?

Answer 1

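@Prune's comment is right: your labeled_featuresets should be a sequence of pairs (two-element lists or tuples), one per data point, each holding a feature dict and a category. Instead, each element of your trainfeats is a list containing one element: a tuple of those two things. Drop the square brackets in both feature-building loops and this part should work correctly. For example,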

negwords = (word_feats(words), 'neg')
negfeats.append(negwords)
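
To see why the extra brackets matter, here is a minimal sketch in Python 3 that imitates the unpacking loop from the traceback without requiring NLTK. The sample tweets and the helpers `tokenize` and `count_labels` are hypothetical, made up for illustration; `count_labels` only mimics the `for featureset, label in labeled_featuresets` line and is not NLTK's code. On Python 3 the ValueError message is worded slightly differently from the Python 2.7 one in the question.

```python
import re

def word_feats(words):
    # Map each token to True, as in the question
    return {word: True for word in words}

def tokenize(tweet):
    # Same regex as in the question (hypothetical wrapper name)
    return re.findall(r"[\w']+|[.,!?;]", tweet)

def count_labels(labeled_featuresets):
    # Hypothetical stand-in for the loop shown in the traceback:
    # each element is unpacked into (featureset, label).
    counts = {}
    for featureset, label in labeled_featuresets:
        counts[label] = counts.get(label, 0) + 1
    return counts

tweets = ["got this hookah", "high af, joy!"]  # made-up sample data

# Wrapped in an extra list, as in the question: each element has length 1
nested = [[(word_feats(tokenize(t)), 'pos')] for t in tweets]
# Plain (featureset, label) tuples, as the answer suggests
flat = [(word_feats(tokenize(t)), 'pos') for t in tweets]

try:
    count_labels(nested)
    nested_error = None
except ValueError as err:
    # Unpacking a one-element list into two names fails
    nested_error = str(err)

flat_counts = count_labels(flat)
print(nested_error)
print(flat_counts)
```

The nested version fails because each element is a one-item list, so there is only one value to unpack into `featureset, label`; the flat version unpacks cleanly.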

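Two more things: consider using nltk.word_tokenize() instead of doing your own tokenization. And you should randomize the order of your training data, e.g. with random.shuffle(trainfeats), which shuffles the list in place.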
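
The advice about randomizing the training order could look roughly like the following Python 3 sketch. It uses `random.shuffle` from the standard library, which reorders a list in place and returns `None`. The toy feature sets and the `split_train_test` helper are assumptions for illustration, not part of the original code. The cutoff uses `//` because `len(feats)*3/4` yields a float in Python 3 and cannot be used as a slice index.

```python
import random

def split_train_test(feats):
    # Hypothetical helper: first 3/4 for training, the rest for testing.
    # Integer division keeps the cutoff usable as a slice index.
    cutoff = len(feats) * 3 // 4
    return feats[:cutoff], feats[cutoff:]

# Made-up (featureset, label) pairs in the flat format the answer describes
negfeats = [({'word%d' % i: True}, 'neg') for i in range(8)]
posfeats = [({'word%d' % i: True}, 'pos') for i in range(8)]

neg_train, neg_test = split_train_test(negfeats)
pos_train, pos_test = split_train_test(posfeats)

trainfeats = neg_train + pos_train
testfeats = neg_test + pos_test

random.seed(0)              # fixed seed only so the sketch is repeatable
random.shuffle(trainfeats)  # shuffles in place; do not assign the result

print(len(trainfeats), len(testfeats))
```

With NLTK installed, the shuffled trainfeats could then be passed to NaiveBayesClassifier.train(trainfeats) as in the question.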
