Explanation of the cv parameter of scikit-learn's cross_val_score

I don't understand why I get different results from this cross_val_score configuration and from a simple model fitted by hand.

from sklearn.datasets import load_iris
from sklearn.utils import shuffle
from sklearn import tree
import numpy as np

np.random.seed(1234)
iris = load_iris()
X, y = iris.data, iris.target
X,y = shuffle(X,y)

print(y)
clf = tree.DecisionTreeClassifier(max_depth=2,class_weight={2: 0.3, 1: 10,0:0.3},random_state=1234)
clf2 = clf.fit(X, y)
tree.plot_tree(clf2)
from  sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
predi = clf2.predict(X)
cm =  confusion_matrix(y_true=y, y_pred=predi)
print(cm)
print("Accuracy = ",round(accuracy_score(y,predi)* 100.0,2))

from sklearn.model_selection import cross_val_score,cross_val_predict
max_id = len(X)
limit = round(max_id*0.6,0)
min_id=0
train = np.arange(0,limit)
test = np.arange(limit,max_id)
test = [int(x) for x in test]
train = [int(x) for x in train]
print(train)
print(test)
predi = cross_val_score(clf,X,y,cv=[(train,test)])
print(predi)
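# Rebuild the same split by hand. Note that a slice end is exclusive,
# so train[0]:train[-1] and test[0]:test[-1] each leave out the last index.
Xtrain = X[train[0]:train[-1]]
y_train = y[train[0]:train[-1]]
Xtest = X[test[0]:test[-1]]
y_test = y[test[0]:test[-1]]

clf3 = clf.fit(Xtrain, y_train)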
predi = clf3.predict(Xtest)
cm =  confusion_matrix(y_true=y_test, y_pred=predi)
print(cm)
print("Accuracy = ",round(accuracy_score(y_test,predi)* 100.0,2))

I don't understand why I get a different accuracy even though I use the same parameters and the same train/test sample.

Basically, the way you split the data affects your model's accuracy. This is well documented in the machine learning field. Secondly, the score of your first model is strongly biased, because you tested it on the same data you trained it on, which inflates the accuracy (often to nearly 100%).
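To make the effect of the split concrete, here is a minimal sketch rather than a definitive diagnosis. It assumes the same iris data, tree settings and 60/40 cut as in the question; the names `train_idx`, `test_idx`, `cv_score`, `manual_score` and `sliced_score` are illustrative and do not come from the original post. It compares the score from `cross_val_score` on one predefined split with a manual fit on exactly the same rows, and then with a manual fit on slices whose exclusive end drops one row from each set.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import shuffle

X, y = load_iris(return_X_y=True)
X, y = shuffle(X, y, random_state=1234)

clf = DecisionTreeClassifier(
    max_depth=2,
    class_weight={0: 0.3, 1: 10, 2: 0.3},
    random_state=1234,
)

# One predefined split: first 60% of the rows for training, the rest for testing.
limit = int(round(len(X) * 0.6))
train_idx = np.arange(0, limit)
test_idx = np.arange(limit, len(X))

# Score computed by cross_val_score for this single (train, test) pair.
cv_score = cross_val_score(clf, X, y, cv=[(train_idx, test_idx)])[0]

# Manual fit on exactly the same rows, selected with the index arrays.
clf.fit(X[train_idx], y[train_idx])
manual_score = accuracy_score(y[test_idx], clf.predict(X[test_idx]))

# Accuracy on the rows the model was trained on (an optimistic estimate).
train_score = accuracy_score(y[train_idx], clf.predict(X[train_idx]))

# Manual fit on slices: the end of a slice is exclusive, so each set loses a row.
X_tr, y_tr = X[train_idx[0]:train_idx[-1]], y[train_idx[0]:train_idx[-1]]
X_te, y_te = X[test_idx[0]:test_idx[-1]], y[test_idx[0]:test_idx[-1]]
clf.fit(X_tr, y_tr)
sliced_score = accuracy_score(y_te, clf.predict(X_te))

print("cross_val_score:       ", cv_score)
print("manual, same indices:  ", manual_score)
print("manual, sliced split:  ", sliced_score, "(rows:", len(X_tr), "/", len(X_te), ")")
print("accuracy on train rows:", train_score)
```

With identical rows and a fixed `random_state`, the first two scores should agree. The sliced variant is trained and tested on slightly different samples, so its score is not guaranteed to match.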

https://www.analyticsvidhya.com/blog/2021/05/4-ways-to-evaluate-your-machine-learning-model-cross-validation-techniques-with-python-code/

https://towardsdatascience.com/train-test-split-c3eed34f763b
