
Using the naive Bayes function on a test and training set of data

I am trying to use the naiveBayes function from the e1071 package on a training set and a test set. I am following this tutorial: https://rpubs.com/riazakhan94/naive_bayes_classifier_e1071

However, it is not working, and this is the error I get: "Error in table(train$Class, trainPred): all arguments must have the same length."

Here is the code I am using; I am guessing it's a simple fix. The x and y columns of the data set are used to predict the Class column:

https://github.com/samuelc12359/NaiveBayes.git


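library(e1071)  # provides naiveBayes()

# Read the data; the files have no header row
test <- read.csv(file="TestX.csv", header=FALSE)
train <- read.csv(file="TrainX.csv", header=FALSE)

# Name the columns: two features and the label
Names <- c("x","y","Class")
colnames(test) <- Names
colnames(train) <- Names

# Fit the classifier on the training set
NBclassfier=naiveBayes(Class~x+y, data=train)
print(NBclassfier)

# Predict on both sets and build confusion tables
trainPred=predict(NBclassfier, train, type="class")
trainTable=table(train$Class, trainPred)  # the error is raised here
testPred=predict(NBclassfier, newdata=test, type="class")
testTable=table(test$Class, testPred)
print(trainTable)
print(testTable)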

You need to convert the Class column to a factor, for example:

train$Class = factor(train$Class)
test$Class = factor(test$Class)

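Do this before calling naiveBayes(). Training and prediction will then behave as you expect. If Class is left numeric, predict() with type="class" returns an empty result, so table() is given two vectors of different lengths, which is the error you are seeing.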
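As a minimal end-to-end sketch of the factor approach: the CSV files from the question are not available here, so the data below is synthetic, and the helper `make_data` is a hypothetical stand-in, not something from the original post. It only illustrates that the predictions line up with the labels once Class is a factor.

```r
library(e1071)

set.seed(1)

# Hypothetical stand-in for TrainX.csv / TestX.csv:
# two numeric features and a 0/1 label (assumed coding)
make_data <- function(n) {
  Class <- rep(c(0, 1), length.out = n)
  data.frame(x = rnorm(n, mean = 2 * Class),
             y = rnorm(n, mean = -2 * Class),
             Class = Class)
}
train <- make_data(100)
test <- make_data(40)

# Convert the label to a factor BEFORE fitting, with the same levels in both sets
train$Class <- factor(train$Class, levels = c(0, 1))
test$Class <- factor(test$Class, levels = c(0, 1))

NBclassfier <- naiveBayes(Class ~ x + y, data = train)

trainPred <- predict(NBclassfier, train, type = "class")
testPred <- predict(NBclassfier, newdata = test, type = "class")

# Both arguments to table() now have one entry per row
trainTable <- table(train$Class, trainPred)
testTable <- table(test$Class, testPred)
print(trainTable)
print(testTable)
```

A quick sanity check when debugging this kind of error is to compare length(trainPred) with nrow(train) before calling table().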

Alternatively, you can change the prediction type to "raw", which returns class probabilities, and convert those to outcomes yourself. This assumes Class is coded 0/1, so that the second column holds the probability of class 1. For example:

# Class probabilities: one row per observation, one column per class
train_predictions = predict(NBclassfier, train, type="raw")
# Label as 1 when the probability in column 2 is at least 0.5
trainPred = 1 * (train_predictions[, 2] >= 0.5)
trainTable=table(train$Class, trainPred)
test_predictions = predict(NBclassfier, newdata=test, type="raw")
testPred = 1 * (test_predictions[, 2] >= 0.5)
testTable=table(test$Class, testPred)
print(trainTable)
print(testTable)
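The 0.5 threshold above only makes sense for two classes. A rough sketch of a more general variant picks the most probable column in each row and maps it back to a label through the column names. The data here is synthetic, and the names `label`, `model`, `probs` and `predLabel` are illustrative, not from the original post. It also relies on the "raw" output carrying the class levels as column names when Class is a factor.

```r
library(e1071)

set.seed(2)

# Hypothetical data standing in for the CSV files from the question
n <- 60
label <- rep(c(0, 1), length.out = n)
train <- data.frame(x = rnorm(n, mean = 3 * label),
                    y = rnorm(n, mean = -3 * label),
                    Class = factor(label))

model <- naiveBayes(Class ~ x + y, data = train)

# Posterior probabilities: one row per observation, one column per class
probs <- predict(model, train, type = "raw")

# Take the most probable class in each row and map the column back to its label
best <- max.col(probs, ties.method = "first")
predLabel <- factor(colnames(probs)[best], levels = levels(train$Class))

print(table(train$Class, predLabel))
```

For two classes this is equivalent to the 0.5 threshold, apart from how exact ties are broken.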
