How can I use the test_proportion data in a machine learning model?

I have a dataset with 4000 CNN features, and it is a binary classification problem. All I know about the test data is the proportion of 1s and 0s. How can I tell my model to use that proportion when predicting the test labels? (For example, is there a way to say: in order to reach this proportion, I will assign this instance 0?)

How can I use it to increase accuracy? In my case the training data consists mostly of 1s (85%), with only 15% 0s. However, the proportion of 1s in my test data is given as 38%, so it is very different from the training data.

I worked a little on balancing the data, and it helped. However, my model still predicts 1 for nearly all of the data. This may also be caused by an adaptation problem.

As @birdwatch suggested, I lowered the threshold for the 0 label to try to increase the number of 0 predictions.

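# Predict class probabilities for the test set
# (column 0 is P(class 0), assuming classifier.classes_ is [0, 1])
proba = classifier.predict_proba(X_test)
# Lowered threshold for class 0: predict 0 whenever P(class 0) >= 0.3
threshold = 0.3
# 1 where P(class 0) < threshold, otherwise 0
y_pred = (proba[:, 0] < threshold).astype(int)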

Before, the predicted class counts were as follows:

1 :  8906
0 :  2968

After changing the threshold, they are now:

1 :  3221
0 :  8653
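
As a quick check, the counts above can be converted into shares and compared with the stated 38% of 1s in the test data. The snippet below is only a sketch of that arithmetic; the names `counts_before`, `counts_after`, and `share_of_ones` are made up for illustration, and the numbers are copied from the counts above.

```python
# Predicted label counts copied from the post (before / after the threshold change)
counts_before = {1: 8906, 0: 2968}
counts_after = {1: 3221, 0: 8653}


def share_of_ones(counts):
    """Fraction of predictions labelled 1."""
    return counts[1] / (counts[0] + counts[1])


print(round(share_of_ones(counts_before), 3))  # 0.75
print(round(share_of_ones(counts_after), 3))   # 0.271
```

So the default cutoff predicts about 75% 1s, and the 0.3 threshold on the 0 class moves this to about 27%, which is below the stated 38% rather than equal to it.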

However, is there any other way to use test_proportions that guarantees the result?

There isn't any sensible way to do that. Doing so would introduce a weird bias into the model. One thing you could do is accept the less likely outcome only if it has a high enough score. Normally you'd use a 0.5 threshold, but here you might use, e.g., 0.7.
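
To make the threshold idea concrete, here is a minimal sketch of two ways the cutoff could be derived from the known proportion instead of being picked by hand. Neither comes from the original posts. The function names `threshold_for_target_rate` and `adjust_for_prior_shift` are hypothetical, the scores are random stand-ins for `classifier.predict_proba(X_test)[:, 1]`, and the second function assumes that only the class proportions differ between training and test data and that the model's probabilities are reasonably calibrated. If the training data was rebalanced, the training proportion passed in would have to be the rebalanced one, not 85%.

```python
import numpy as np


def threshold_for_target_rate(p_one, target_rate):
    """Cutoff on P(class 1) such that roughly `target_rate` of the
    instances score at or above it (the quantile at 1 - target_rate)."""
    return np.quantile(p_one, 1.0 - target_rate)


def adjust_for_prior_shift(p_one, train_rate, test_rate):
    """Rescale P(class 1) from the training class proportion to the
    test class proportion (assumes only the class prior has changed)."""
    w1 = test_rate / train_rate                  # weight for class 1
    w0 = (1.0 - test_rate) / (1.0 - train_rate)  # weight for class 0
    return p_one * w1 / (p_one * w1 + (1.0 - p_one) * w0)


# Stand-in scores for illustration only; with a real model these would be
# classifier.predict_proba(X_test)[:, 1]
rng = np.random.default_rng(0)
p_one = rng.beta(5, 2, size=11874)

# Option 1: force the predicted share of 1s to match the stated 38%
cutoff = threshold_for_target_rate(p_one, target_rate=0.38)
y_pred = (p_one >= cutoff).astype(int)
print(y_pred.mean())  # share of predicted 1s, approximately 0.38

# Option 2: correct the probabilities for the new prior, keep the usual 0.5 cutoff
p_adj = adjust_for_prior_shift(p_one, train_rate=0.85, test_rate=0.38)
y_pred_adj = (p_adj >= 0.5).astype(int)
```

The first option matches the proportion by construction, but it only ranks instances by score, so it cannot fix a model whose ranking is poor. The second does not force the proportion; it only shifts the effective cutoff upward for the class that is over-represented in training, which is in the same spirit as raising the threshold from 0.5 to 0.7.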
