
How to get 10 individual confusion matrices using 10-fold cross validation using sklearn

I'm new to machine learning, and this is my first time using the sklearn packages. In this classification problem I want a confusion matrix for each fold, but I only get one. This is what I have done so far; I haven't included the preprocessing part here.

from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict

target = df["class"]
features = df.drop("class", axis=1)
split_df = round(0.8 * len(df))

features = features.sample(frac=1, random_state=0)
target = target.sample(frac=1, random_state=0)

trainFeatures, trainClassLabels = features.iloc[:split_df], target.iloc[:split_df]
testFeatures, testClassLabels = features.iloc[split_df:], target.iloc[split_df:]

tree = DecisionTreeClassifier(random_state=0)
tree.fit(X=trainFeatures, y=trainClassLabels)

y_pred = cross_val_predict(tree, X=features, y=target, cv=10)

conf_matrix = confusion_matrix(target, y_pred)
print("Confusion matrix:\n", conf_matrix)

You need to pass an explicit KFold splitter instead of specifying cv=10, so that the same fold indices can be reused afterwards to slice the predictions. For example:

from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.datasets import make_classification

features, target = make_classification(random_state=0)

tree = DecisionTreeClassifier(random_state=0)
kf = KFold(n_splits=10, shuffle=True, random_state=99)

y_pred = cross_val_predict(tree, X=features, y=target, cv=kf)

conf_matrix = confusion_matrix(target, y_pred)
print("Confusion matrix:\n", conf_matrix)

Confusion matrix:
 [[41  9]
 [ 6 44]]
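
Slicing `y_pred` by fold only works because `cross_val_predict` returns one out-of-fold prediction per sample in the original row order, and because a `KFold` with a fixed integer `random_state` yields the same folds on every call to `split`. Here is a minimal sketch that checks the splitter side of this, assuming the 100 samples that `make_classification` produces by default. `kf_check` and `X_dummy` are illustrative names, not part of the original answer:

```python
import numpy as np
from sklearn.model_selection import KFold

n_samples = 100  # assumption: make_classification's default sample count
X_dummy = np.zeros((n_samples, 1))  # placeholder data; only its length matters for splitting

# Same configuration as the splitter used in the answer
kf_check = KFold(n_splits=10, shuffle=True, random_state=99)

# Collect the test indices from two separate calls to split()
first = [test for _, test in kf_check.split(X_dummy)]
second = [test for _, test in kf_check.split(X_dummy)]

# With an integer random_state, both calls should give identical folds
same_splits = all(np.array_equal(a, b) for a, b in zip(first, second))

# Every sample index should appear in exactly one test fold
covers_once = np.array_equal(np.sort(np.concatenate(first)), np.arange(n_samples))

print(same_splits, covers_once)
```

If `shuffle=True` were used without a fixed `random_state`, a second call to `split` could produce different folds, and the per-fold slices would no longer match the folds used for prediction.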

Then we can compute the confusion matrix for each fold:

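# Reuse the same splitter: with a fixed random_state, kf.split yields the same folds as before
lst = []
for train_index, test_index in kf.split(features):
    # Slice true labels and out-of-fold predictions with this fold's test indices
    lst.append(confusion_matrix(target[test_index], y_pred[test_index]))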

It looks like this:

[array([[4, 0],
        [0, 6]]),
 array([[4, 3],
        [1, 2]]),
 array([[2, 0],
        [2, 6]]),
 array([[5, 1],
        [0, 4]]),
 array([[4, 1],
        [1, 4]]),
 array([[2, 2],
        [0, 6]]),
 array([[4, 0],
        [0, 6]]),
 array([[4, 1],
        [1, 4]]),
 array([[4, 1],
        [1, 4]]),
 array([[8, 0],
        [0, 2]])]
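
The question works with a pandas DataFrame whose rows were shuffled, so `target[test_index]` would look up index labels rather than positions. Below is a hedged sketch of the same idea for that case. The synthetic `df`, the `f0`, `f1`, ... column names, and the `fold_matrices`, `pooled` and `total` names are assumptions made for illustration. It converts the target to a NumPy array for positional indexing, passes `labels=` so every matrix keeps the same shape even if a fold lacks a class, and checks that the per-fold matrices add up to the pooled one:

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for the question's df: synthetic data with a "class" column
X, y = make_classification(random_state=0)
df = pd.DataFrame(X, columns=[f"f{i}" for i in range(X.shape[1])])
df["class"] = y
df = df.sample(frac=1, random_state=0)  # shuffled rows: the index is no longer 0..n-1

features = df.drop("class", axis=1)
target = df["class"]

tree = DecisionTreeClassifier(random_state=0)
kf = KFold(n_splits=10, shuffle=True, random_state=99)
y_pred = cross_val_predict(tree, X=features, y=target, cv=kf)

labels = np.unique(target)   # fixed label order, so every matrix has the same shape
y_true = target.to_numpy()   # positional array; avoids label-based Series indexing

# One confusion matrix per fold, sliced by that fold's test positions
fold_matrices = [
    confusion_matrix(y_true[test_index], y_pred[test_index], labels=labels)
    for _, test_index in kf.split(features)
]

# Sanity check: the per-fold matrices should sum to the pooled matrix
pooled = confusion_matrix(y_true, y_pred, labels=labels)
total = np.sum(fold_matrices, axis=0)
print(np.array_equal(total, pooled))
```

Using `target.iloc[test_index]` instead of the NumPy conversion would work equally well. The point is only that the fold indices are positions, not index labels.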

