
Sample k-fold cross validation in Python

I want to test k-fold (k=3) cross-validation in Python.

I got this code from the web:

import nltk # needed for Naive-Bayes
import numpy as np
from sklearn.model_selection import KFold

# data is an array with our already pre-processed dataset examples
kf = KFold(n_splits=3)
sum = 0
for train, test in kf.split(data):
    train_data = np.array(data)[train]
    test_data = np.array(data)[test]
    classifier = nltk.NaiveBayesClassifier.train(train_data)
    sum += nltk.classify.accuracy(classifier, test_data)
average = sum/3

and added:

data = [10, 20, 30, 40, 50]

The result is this error:

Traceback (most recent call last):
  File "/Users/david/PycharmProjects/iranian-01/pandas_test.py", line 12, in <module>
    classifier = nltk.NaiveBayesClassifier.train(train_data)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nltk/classify/naivebayes.py", line 194, in train
    for featureset, label in labeled_featuresets:
TypeError: 'numpy.int64' object is not iterable

Please help me solve this.

You need to put your data into the right shape before training and testing. You are running a 3-fold cross-validation on this data:

data = [10, 20, 30, 40, 50]

Each element is a single number, so after the split the classifier receives bare numbers (numpy.int64 values) instead of the (featureset, label) pairs it tries to unpack, which is what the traceback is reporting. I also advise you to work with NumPy arrays rather than plain Python lists, so that you can use the functions that library already provides.
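As a rough sketch of what correctly shaped data might look like, the example below runs the same 3-fold loop on a made-up toy dataset. The feature name "size", the labels "low" and "high", and the six examples are all hypothetical and stand in for whatever your pre-processing produces. The loop also indexes the list directly instead of going through np.array, which is a choice made here to keep each item a (dict, label) tuple, not something the original code did.

```python
import nltk
from sklearn.model_selection import KFold

# Hypothetical toy dataset: each example is a (featureset, label) pair,
# which is the shape NaiveBayesClassifier.train unpacks in its loop.
data = [
    ({"size": "small"}, "low"),
    ({"size": "small"}, "low"),
    ({"size": "small"}, "low"),
    ({"size": "large"}, "high"),
    ({"size": "large"}, "high"),
    ({"size": "large"}, "high"),
]

kf = KFold(n_splits=3, shuffle=True, random_state=0)
scores = []
for train_idx, test_idx in kf.split(data):
    # Index the list directly so every item stays a (dict, label) tuple.
    train_data = [data[i] for i in train_idx]
    test_data = [data[i] for i in test_idx]
    classifier = nltk.NaiveBayesClassifier.train(train_data)
    scores.append(nltk.classify.accuracy(classifier, test_data))

# Average over however many folds were actually run.
average = sum(scores) / len(scores)
print(average)
```

Collecting the per-fold accuracies in a list also avoids reusing the name sum, which otherwise hides Python's built-in function of the same name.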


 