Explanation of the cv parameter of cross_val_score in scikit-learn
I don't understand why I get different results from cross_val_score in this configuration and from a simple model fitted by hand.
from sklearn.datasets import load_iris
from sklearn.utils import shuffle
from sklearn import tree
import numpy as np
np.random.seed(1234)
iris = load_iris()
X, y = iris.data, iris.target
X,y = shuffle(X,y)
print(y)
clf = tree.DecisionTreeClassifier(max_depth=2,class_weight={2: 0.3, 1: 10,0:0.3},random_state=1234)
clf2 = clf.fit(X, y)
tree.plot_tree(clf2)
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
predi = clf2.predict(X)
cm = confusion_matrix(y_true=y, y_pred=predi)
print(cm)
print("Accuracy = ",round(accuracy_score(y,predi)* 100.0,2))
from sklearn.model_selection import cross_val_score,cross_val_predict
max_id = len(X)
limit = round(max_id*0.6,0)
min_id=0
train = np.arange(0,limit)
test = np.arange(limit,max_id)
test = [int(x) for x in test]
train = [int(x) for x in train]
print(train)
print(test)
predi = cross_val_score(clf,X,y,cv=[(train,test)])
print(predi)
Xtrain = X[train[0]:train[-1]]
y_train = y[train[0]:train[-1]]
Xtest = X[test[0]:test[-1]]
y_test = y[test[0]:test[-1]]
clf3 = clf.fit(Xtrain,y_train)
predi = clf3.predict(Xtest)
cm = confusion_matrix(y_true=y_test, y_pred=predi)
print(cm)
print("Accuracy = ",round(accuracy_score(y_test,predi)* 100.0,2))
I don't understand why I get a different accuracy when I use the same parameters and the same train/test samples.
Basically, the way you split the data has an impact on your model's accuracy. This is well documented in the machine learning field. Secondly, the evaluation of your first model is clearly biased, because you used the training set for testing, which tends to inflate the accuracy (up to ~100%).
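To make the comparison concrete, here is a minimal sketch (not the asker's exact script) that passes one predefined split to cross_val_score and then repeats the same fit and scoring by hand. It assumes the manual version indexes X and y with the full index arrays; the names train_idx, test_idx and n_train are introduced here for illustration. It also shows that a slice such as X[train[0]:train[-1]] excludes its end point, so the sliced sets in the question are each one sample shorter than the sets cross_val_score receives, which is one likely reason the two accuracies differ.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import shuffle

X, y = load_iris(return_X_y=True)
X, y = shuffle(X, y, random_state=1234)  # fixed seed so the split is reproducible

clf = DecisionTreeClassifier(
    max_depth=2,
    class_weight={0: 0.3, 1: 10, 2: 0.3},
    random_state=1234,
)

# 60% of the rows for training, the remaining 40% for testing
n_train = int(round(len(X) * 0.6))
train_idx = np.arange(0, n_train)
test_idx = np.arange(n_train, len(X))

# cv accepts an iterable of (train indices, test indices) pairs;
# a single pair yields a single score
cv_score = cross_val_score(clf, X, y, cv=[(train_idx, test_idx)])[0]

# Manual equivalent: fit a fresh copy on the training rows, score on the test rows
manual = clone(clf).fit(X[train_idx], y[train_idx])
manual_score = accuracy_score(y[test_idx], manual.predict(X[test_idx]))
print("cross_val_score:", cv_score)
print("manual split:   ", manual_score)

# Slicing from the first index to the last index drops the last sample,
# because the end of a Python slice is exclusive
sliced_train = X[train_idx[0]:train_idx[-1]]
sliced_test = X[test_idx[0]:test_idx[-1]]
print(len(train_idx), "training rows by index,", len(sliced_train), "by slice")
print(len(test_idx), "test rows by index,", len(sliced_test), "by slice")

# Scoring on the rows used for fitting is a separate, optimistic measurement
train_score = accuracy_score(y[train_idx], manual.predict(X[train_idx]))
print("accuracy on the training rows:", train_score)
```

If the manual version has to use slices, including the end point (for example X[train[0]:train[-1] + 1]) or indexing with the index lists directly should give the same rows that cross_val_score uses.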
https://www.analyticsvidhya.com/blog/2021/05/4-ways-to-evaluate-your-machine-learning-model-cross-validation-techniques-with-python-code/
https://towardsdatascience.com/train-test-split-c3eed34f763b