How to plot the ROC curve for ANN for 10 fold Cross validation in Keras using Python?

I was just trying to produce the ROC plot for all 10 experiments of 10-fold cross-validation for an ANN in Keras. I have been stuck on it for a week and cannot find a solution. Could anyone help with this? I have tried the code from the following sklearn example ( https://scikit-learn.org/stable/auto_examples/model_selection/plot_roc_crossval.html ) and wanted to use a wrapper so the Keras model can be used in sklearn, but it shows errors. My code in Python:

## Creating NN in Keras
# Load libraries
import numpy as np
from keras import models
from keras import layers
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification

# Set random seed
np.random.seed(7)
#Create Function That Constructs Neural Network
# Create function returning a compiled network
def create_network():
    
    # Start neural network
    network = models.Sequential()

    # Add fully connected layer with a ReLU activation function
    network.add(layers.Dense(units=25, activation='relu', input_shape=(X.shape[1],)))

    # Add fully connected layer with a ReLU activation function
    network.add(layers.Dense(units=X.shape[1], activation='relu'))

    # Add fully connected layer with a sigmoid activation function
    network.add(layers.Dense(units=1, activation='sigmoid'))

    # Compile neural network
    network.compile(loss='binary_crossentropy', # Cross-entropy
                    optimizer='adam', # Adam optimizer
                    metrics=['accuracy']) # Accuracy performance metric
    
    # Return compiled network
    return network

###
#Wrap Function In KerasClassifier
# Wrap Keras model so it can be used by scikit-learn
neural_network = KerasClassifier(build_fn=create_network, 
                                 epochs=150, 
                                 batch_size=10, 
                                 verbose=0)


    import numpy as np
    import matplotlib.pyplot as plt
    
    from sklearn import svm, datasets
    from sklearn.metrics import auc
    from sklearn.metrics import plot_roc_curve
    from sklearn.model_selection import StratifiedKFold
    
    # X and y are assumed to be already defined (the user's own feature matrix and labels)
    n_samples, n_features = X.shape
    
    # Add noisy features
    random_state = np.random.RandomState(0)
    X = np.c_[X, random_state.randn(n_samples, 200 * n_features)]
    
    # #############################################################################
    # Classification and ROC analysis
    
    # Run classifier with cross-validation and plot ROC curves
    cv = StratifiedKFold(n_splits=10)
    classifier = neural_network
    
    tprs = []
    aucs = []
    mean_fpr = np.linspace(0, 1, 100)
    
    fig, ax = plt.subplots()
    for i, (train, test) in enumerate(cv.split(X, y)):
        classifier.fit(X[train], y[train])
        viz = plot_roc_curve(classifier, X[test], y[test],
                             name='ROC fold {}'.format(i),
                             alpha=0.3, lw=1, ax=ax)
        interp_tpr = np.interp(mean_fpr, viz.fpr, viz.tpr)
        interp_tpr[0] = 0.0
        tprs.append(interp_tpr)
        aucs.append(viz.roc_auc)
    
    ax.plot([0, 1], [0, 1], linestyle='--', lw=2, color='r',
            label='Chance', alpha=.8)
    
    mean_tpr = np.mean(tprs, axis=0)
    mean_tpr[-1] = 1.0
    mean_auc = auc(mean_fpr, mean_tpr)
    std_auc = np.std(aucs)
    ax.plot(mean_fpr, mean_tpr, color='b',
            label=r'Mean ROC (AUC = %0.2f $\pm$ %0.2f)' % (mean_auc, std_auc),
            lw=2, alpha=.8)
    
    std_tpr = np.std(tprs, axis=0)
    tprs_upper = np.minimum(mean_tpr + std_tpr, 1)
    tprs_lower = np.maximum(mean_tpr - std_tpr, 0)
    ax.fill_between(mean_fpr, tprs_lower, tprs_upper, color='grey', alpha=.2,
                    label=r'$\pm$ 1 std. dev.')
    
    ax.set(xlim=[-0.05, 1.05], ylim=[-0.05, 1.05],
           title="Receiver operating characteristic example")
    ax.legend(loc="lower right")
    plt.show()


**It shows the following error:**



---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-29-f10078491154> in <module>()
     40     viz = plot_roc_curve(classifier, X[test], y[test],
     41                          name='ROC fold {}'.format(i),
---> 42                          alpha=0.3, lw=1, ax=ax)
     43     interp_tpr = np.interp(mean_fpr, viz.fpr, viz.tpr)
     44     interp_tpr[0] = 0.0

/usr/local/lib/python3.6/dist-packages/sklearn/metrics/_plot/roc_curve.py in plot_roc_curve(estimator, X, y, sample_weight, drop_intermediate, response_method, name, ax, **kwargs)
    170     )
    171     if not is_classifier(estimator):
--> 172         raise ValueError(classification_error)
    173 
    174     prediction_method = _check_classifer_response_method(estimator,

ValueError: KerasClassifier should be a binary classifier
    

I had the same question. I found this link very informative: https://www.kaggle.com/kanncaa1/roc-curve-with-k-fold-cv . I have modified it for my case as below:

import numpy as np
import matplotlib.pyplot as plt
from numpy import interp
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.constraints import maxnorm
from keras.optimizers import SGD
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import StratifiedKFold

# X_train and y_train are assumed to be already defined (the user's own data)
seed = 7
np.random.seed(seed)

tprs = []
aucs = []
mean_fpr = np.linspace(0, 1, 100)
i = 1

fig, ax = plt.subplots()
kfold = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
# for i, (train, test) in enumerate(cv.split(X_13, target)):
for train, test in kfold.split(X_train, y_train):
    # create model
    model = Sequential()
    model.add(Dense(100, input_dim=X_train.shape[1], activation='relu', kernel_constraint=maxnorm(3)))
    model.add(Dropout(0.2))
    model.add(Dense(80, activation='relu', kernel_constraint=maxnorm(3)))
    model.add(Dropout(0.2))
    model.add(Dense(1, activation='sigmoid'))

    # compile model
    sgd = SGD(lr=0.1, momentum=0.8)
    model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])
    model.fit(X_train[train], y_train[train], epochs=100, batch_size=15, verbose=0)

    # evaluate the model on the held-out fold
    y_pred_keras = model.predict_proba(X_train[test]).ravel()
    fpr, tpr, thresholds = roc_curve(y_train[test], y_pred_keras)
    tprs.append(interp(mean_fpr, fpr, tpr))
    roc_auc = auc(fpr, tpr)
    aucs.append(roc_auc)
    plt.plot(fpr, tpr, lw=2, alpha=0.3, label='ROC fold %d (AUC = %0.2f)' % (i, roc_auc))
    i = i + 1

plt.plot([0, 1], [0, 1], linestyle='--', lw=2, color='black')
mean_tpr = np.mean(tprs, axis=0)
mean_auc = auc(mean_fpr, mean_tpr)
plt.plot(mean_fpr, mean_tpr, color='blue',
         label=r'Mean ROC (AUC = %0.2f)' % mean_auc, lw=2, alpha=1)
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC')
plt.legend(loc="lower right")
plt.show()

Because this code calls roc_curve directly on the model's predicted probabilities, it never goes through sklearn's plot_roc_curve and therefore never hits the check that raises the error. Hope it could help!

I have just answered what seems to be the copy of this post (apart from variable names) here.

Not sure whether this is an exact duplicate or not because the question comes from a different account, but it seems like it. Here is a copy of my answer in case one of these is closed as a duplicate.


This is an implementation detail that is (probably) missing in this wrapper library.

Sklearn simply checks whether an attribute called `_estimator_type` is present on the estimator and is set to the string value `classifier`. You can see that by looking into sklearn's source code on GitHub:

def is_classifier(estimator):
    """Return True if the given estimator is (probably) a classifier.
    Parameters
    ----------
    estimator : object
        Estimator object to test.
    Returns
    -------
    out : bool
        True if estimator is a classifier and False otherwise.
    """
    return getattr(estimator, "_estimator_type", None) == "classifier"

All you need to do is add this attribute to your classifier object manually:

classifier = KerasClassifier(build_fn=create_network,
                             epochs=10,
                             batch_size=100,
                             verbose=2)

classifier._estimator_type = "classifier"

I have tested it and it works.
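For completeness, here is a minimal sketch of how the fix slots into the cross-validated ROC code from the question. It is only a sketch, not a definitive implementation: it assumes the create_network() function defined in the question, the old keras.wrappers.scikit_learn.KerasClassifier wrapper, a scikit-learn version that still provides plot_roc_curve (it was removed in 1.2), and synthetic make_classification data standing in for the real dataset.

import numpy as np
import matplotlib.pyplot as plt
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.datasets import make_classification
from sklearn.metrics import auc, plot_roc_curve
from sklearn.model_selection import StratifiedKFold

# Synthetic data standing in for the user's real dataset (assumption)
X, y = make_classification(n_samples=500, n_features=20, random_state=7)

# create_network is assumed to be the function defined in the question
classifier = KerasClassifier(build_fn=create_network,
                             epochs=10, batch_size=100, verbose=0)
classifier._estimator_type = "classifier"   # the manual fix described above

cv = StratifiedKFold(n_splits=10)
tprs, aucs = [], []
mean_fpr = np.linspace(0, 1, 100)

fig, ax = plt.subplots()
for i, (train, test) in enumerate(cv.split(X, y)):
    classifier.fit(X[train], y[train])
    # plot_roc_curve no longer raises, because is_classifier() now returns True
    viz = plot_roc_curve(classifier, X[test], y[test],
                         name='ROC fold {}'.format(i), alpha=0.3, lw=1, ax=ax)
    interp_tpr = np.interp(mean_fpr, viz.fpr, viz.tpr)
    interp_tpr[0] = 0.0
    tprs.append(interp_tpr)
    aucs.append(viz.roc_auc)

# Average the per-fold curves and plot the mean ROC
mean_tpr = np.mean(tprs, axis=0)
mean_tpr[-1] = 1.0
ax.plot(mean_fpr, mean_tpr,
        label='Mean ROC (AUC = %0.2f)' % auc(mean_fpr, mean_tpr))
ax.legend(loc="lower right")
plt.show()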
