Explaining the cv parameter of scikit-learn's cross_val_score

Question

I don't understand why I get different results from cross_val_score with this configuration and from a simple model fitted by hand.

from sklearn.datasets import load_iris
from sklearn.utils import shuffle
from sklearn import tree
import numpy as np

np.random.seed(1234)
iris = load_iris()
X, y = iris.data, iris.target
X,y = shuffle(X,y)

print(y)
clf = tree.DecisionTreeClassifier(max_depth=2,class_weight={2: 0.3, 1: 10,0:0.3},random_state=1234)
clf2 = clf.fit(X, y)
tree.plot_tree(clf2)
from  sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
predi = clf2.predict(X)
cm =  confusion_matrix(y_true=y, y_pred=predi)
print(cm)
print("Accuracy = ",round(accuracy_score(y,predi)* 100.0,2))

from sklearn.model_selection import cross_val_score,cross_val_predict
max_id = len(X)
limit = round(max_id*0.6,0)
min_id=0
train = np.arange(0,limit)
test = np.arange(limit,max_id)
test = [int(x) for x in test]
train = [int(x) for x in train]
print(train)
print(test)
predi = cross_val_score(clf,X,y,cv=[(train,test)])
print(predi)
# Rebuild the same split by hand, using slices over the index lists
Xtrain = X[train[0]:train[-1]]
y_train =  y[train[0]:train[-1]]
Xtest = X[test[0]:test[-1]]
y_test =  y[test[0]:test[-1]]


clf3 = clf.fit(Xtrain,y_train)
predi = clf3.predict(Xtest)
cm =  confusion_matrix(y_true=y_test, y_pred=predi)
print(cm)
print("Accuracy = ",round(accuracy_score(y_test,predi)* 100.0,2))

I don't understand why I get a different accuracy even though I use the same parameters and the same train/test samples.

Answer 1

Basically, the way you split the data affects the accuracy you measure for your model. This is well documented in the machine learning literature. Secondly, your first model's score is optimistically biased: you evaluated it on the same data you trained it on, which tends to give an accuracy close to 100%.

https://www.analyticsvidhya.com/blog/2021/05/4-ways-to-evaluate-your-machine-learning-model-cross-validation-techniques-with-python-code/

https://towardsdatascience.com/train-test-split-c3eed34f763b
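
To illustrate the point about splits, here is a minimal sketch, not the asker's exact script. It assumes the goal is to reproduce the single (train, test) split passed to cv by hand. The names n_train, train_idx and test_idx are illustrative, and random_state is passed to shuffle directly instead of seeding NumPy globally. When the rows are selected with the index arrays themselves, the manual accuracy should match cross_val_score. A slice such as test[0]:test[-1] leaves out the last index, so it selects a slightly different sample, which may contribute to the mismatch in the question.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import shuffle

X, y = load_iris(return_X_y=True)
X, y = shuffle(X, y, random_state=1234)

# One fixed 60/40 split, expressed as integer index arrays.
n_train = int(len(X) * 0.6)
train_idx = np.arange(0, n_train)
test_idx = np.arange(n_train, len(X))

clf = DecisionTreeClassifier(
    max_depth=2,
    class_weight={0: 0.3, 1: 10, 2: 0.3},
    random_state=1234,
)

# cv accepts an iterable of (train, test) index pairs; here there is one pair.
cv_scores = cross_val_score(clf, X, y, cv=[(train_idx, test_idx)])

# Manual equivalent: fit on the training rows, score on the held-out rows.
clf.fit(X[train_idx], y[train_idx])
manual_acc = accuracy_score(y[test_idx], clf.predict(X[test_idx]))

# A slice with an exclusive end drops the last test sample.
n_sliced = len(y[test_idx[0]:test_idx[-1]])

print("cross_val_score:", cv_scores[0])
print("manual accuracy:", manual_acc)
print("test rows by index:", len(test_idx), "by slice:", n_sliced)
```

Indexing with X[train_idx] rather than a start:stop slice keeps the manual split identical to the one cross_val_score uses internally.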
