How can I test my training model using a different dataset in machine learning

Hello, I am very new to Python and machine learning and I am running into an issue. After splitting my data and completing the training and testing of my model, I now need to test it on a completely different dataset.

Below is how I trained and tested the model:

import sklearn.naive_bayes

# Using a Naive Bayes classifier model
nb_model = sklearn.naive_bayes.MultinomialNB()
nb_model.fit(X_train_v, y_train)
y_pred_class = nb_model.predict(X_test_v)
y_pred_probs = nb_model.predict_proba(X_test_v)

What would I need to adjust so that I can run a new dataset through the trained model?

Thank you for your time and your help!

Specifically and functionally speaking, your new dataset should have the same number of features as the data the model was trained on.

If x_train.shape gives (752, 8), then you know it has 752 samples and 8 features.

After your model has been trained on it, you can be sure that model.n_features_in_ will give you 8 (this is the attribute name in recent scikit-learn versions).
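As a minimal sketch of that check: the data below is a randomly generated stand-in (not the asker's dataset), and `has_expected_features` is a hypothetical helper introduced only for illustration. It relies on the `n_features_in_` attribute that recent scikit-learn versions set on fitted estimators.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Hypothetical stand-in data: 752 samples with 8 non-negative count features
rng = np.random.default_rng(0)
X_train = rng.integers(0, 100, size=(752, 8))
y_train = rng.integers(0, 2, size=752)

model = MultinomialNB().fit(X_train, y_train)


def has_expected_features(fitted_model, X):
    """Return True if X has as many columns as the model saw during fit."""
    return X.shape[1] == fitted_model.n_features_in_


# Number of features recorded when the model was fitted
print(model.n_features_in_)
```

Calling the helper on a candidate dataset before `predict` gives a clearer message than the low-level shape error raised inside the estimator.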

Your model is now able to predict outputs from data with 8 features:

import numpy as np
# 10 randomly generated samples with 8 features
new_dataset_1 = np.random.randint(0, 100, size=(10, 8))
new_pred_1 = model.predict(new_dataset_1)
# > array([47, 15,  2, 81, 99, 63, 53, 55, 24, 47])
new_pred_1.shape
# > (10, )  # One predicted class per sample

If you try to predict from data that has any other number of features, it will fail:

# 10 randomly generated samples with 9 features
new_dataset_2 = np.random.randint(0, 100, size=(10, 9))
new_pred_2 = model.predict(new_dataset_2)
# > ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0,
# with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 8 is different from 9)

In other instances, there might be ways to get the same number of features, but it all depends on the hypothesis, on the kind of data, or on the tested model.
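One such case may apply to the question itself. The `_v` suffix in `X_train_v` suggests the features came from a text vectorizer; assuming it was a scikit-learn `CountVectorizer` (the question does not show this step, so this is an assumption), the new dataset gets the same number of features by reusing the already fitted vectorizer with `transform` instead of fitting a new one. The texts and labels below are hypothetical toy data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data standing in for the original training set
train_texts = ["free prize now", "meeting at noon", "win free cash", "lunch at noon"]
y_train = ["spam", "ham", "spam", "ham"]

# Learn the vocabulary once, on the training data only
vectorizer = CountVectorizer()
X_train_v = vectorizer.fit_transform(train_texts)

nb_model = MultinomialNB()
nb_model.fit(X_train_v, y_train)

# A completely different dataset: transform only, never re-fit the vectorizer,
# so the columns match the ones the model was trained on
new_texts = ["free cash prize", "team meeting at noon tomorrow"]
X_new_v = vectorizer.transform(new_texts)

new_pred_class = nb_model.predict(X_new_v)
new_pred_probs = nb_model.predict_proba(X_new_v)
```

Words in the new texts that were not in the training vocabulary are simply ignored by `transform`, which is why the feature count stays fixed.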

Of course, this is just an illustration and it doesn't make any sense to predict on randomly generated data. Your new data should instead represent something that is related to the training data.

For example, it is reasonable to try to predict the reproductive rate of fire ants from Austria with a model that you trained on the reproductive rate of fire ants from Germany.
