How to plot ROC curves for 10-fold cross-validation of an ANN in Keras using Python?

Question

I just want to plot the ROC curves for all 10 folds of a 10-fold cross-validation of an ANN in Keras. I have been stuck on this for a week and cannot find a solution. Can anyone help? I tried the code from the following scikit-learn example ( https://scikit-learn.org/stable/auto_examples/model_selection/plot_roc_crossval.html ) and wanted to use the wrapper so that the Keras model can be used in sklearn, but it raises an error. My Python code:

## Creating NN in Keras
# Load libraries
import numpy as np
from keras import models
from keras import layers
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification

# Set random seed
np.random.seed(7)

# X (feature matrix) and y (binary labels) are assumed to be defined before this point

# Create function returning a compiled network
def create_network():
    
    # Start neural network
    network = models.Sequential()

    # Add fully connected layer with a ReLU activation function
    network.add(layers.Dense(units=25, activation='relu', input_shape=(X.shape[1],)))

    # Add fully connected layer with a ReLU activation function
    network.add(layers.Dense(units=X.shape[1], activation='relu'))

    # Add fully connected layer with a sigmoid activation function
    network.add(layers.Dense(units=1, activation='sigmoid'))

    # Compile neural network
    network.compile(loss='binary_crossentropy', # Cross-entropy
                    optimizer='adam', # Adam optimizer
                    metrics=['accuracy']) # Accuracy performance metric
    
    # Return compiled network
    return network

# Wrap Keras model so it can be used by scikit-learn
neural_network = KerasClassifier(build_fn=create_network, 
                                 epochs=150, 
                                 batch_size=10, 
                                 verbose=0)


    import numpy as np
    import matplotlib.pyplot as plt
    
    from sklearn import svm, datasets
    from sklearn.metrics import auc
    from sklearn.metrics import plot_roc_curve
    from sklearn.model_selection import StratifiedKFold
    
    n_samples, n_features = X.shape
    
    # Add noisy features
    random_state = np.random.RandomState(0)
    X = np.c_[X, random_state.randn(n_samples, 200 * n_features)]
    
    # #############################################################################
    # Classification and ROC analysis
    
    # Run classifier with cross-validation and plot ROC curves
    cv = StratifiedKFold(n_splits=10)
    classifier = neural_network
    
    tprs = []
    aucs = []
    mean_fpr = np.linspace(0, 1, 100)
    
    fig, ax = plt.subplots()
    for i, (train, test) in enumerate(cv.split(X, y)):
        classifier.fit(X[train], y[train])
        viz = plot_roc_curve(classifier, X[test], y[test],
                             name='ROC fold {}'.format(i),
                             alpha=0.3, lw=1, ax=ax)
        interp_tpr = np.interp(mean_fpr, viz.fpr, viz.tpr)
        interp_tpr[0] = 0.0
        tprs.append(interp_tpr)
        aucs.append(viz.roc_auc)
    
    ax.plot([0, 1], [0, 1], linestyle='--', lw=2, color='r',
            label='Chance', alpha=.8)
    
    mean_tpr = np.mean(tprs, axis=0)
    mean_tpr[-1] = 1.0
    mean_auc = auc(mean_fpr, mean_tpr)
    std_auc = np.std(aucs)
    ax.plot(mean_fpr, mean_tpr, color='b',
            label=r'Mean ROC (AUC = %0.2f $\pm$ %0.2f)' % (mean_auc, std_auc),
            lw=2, alpha=.8)
    
    std_tpr = np.std(tprs, axis=0)
    tprs_upper = np.minimum(mean_tpr + std_tpr, 1)
    tprs_lower = np.maximum(mean_tpr - std_tpr, 0)
    ax.fill_between(mean_fpr, tprs_lower, tprs_upper, color='grey', alpha=.2,
                    label=r'$\pm$ 1 std. dev.')
    
    ax.set(xlim=[-0.05, 1.05], ylim=[-0.05, 1.05],
           title="Receiver operating characteristic example")
    ax.legend(loc="lower right")
    plt.show()


**It shows the following error:**

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-29-f10078491154> in <module>()
     40     viz = plot_roc_curve(classifier, X[test], y[test],
     41                          name='ROC fold {}'.format(i),
---> 42                          alpha=0.3, lw=1, ax=ax)
     43     interp_tpr = np.interp(mean_fpr, viz.fpr, viz.tpr)
     44     interp_tpr[0] = 0.0

/usr/local/lib/python3.6/dist-packages/sklearn/metrics/_plot/roc_curve.py in plot_roc_curve(estimator, X, y, sample_weight, drop_intermediate, response_method, name, ax, **kwargs)
    170     )
    171     if not is_classifier(estimator):
--> 172         raise ValueError(classification_error)
    173 
    174     prediction_method = _check_classifer_response_method(estimator,

ValueError: KerasClassifier should be a binary classifier

Answer 1

I had the same problem. I found this link very useful: https://www.kaggle.com/kanncaa1/roc-curve-with-k-fold-cv . I modified it for my case as follows:

    import numpy as np
    import matplotlib.pyplot as plt
    from keras.models import Sequential
    from keras.layers import Dense, Dropout
    from keras.constraints import MaxNorm  # older Keras releases also expose this as maxnorm
    from keras.optimizers import SGD
    from sklearn.metrics import roc_curve, auc
    from sklearn.model_selection import StratifiedKFold

    # X_train and y_train are assumed to be NumPy arrays defined earlier
    seed = 7
    np.random.seed(seed)

    tprs = []
    aucs = []
    mean_fpr = np.linspace(0, 1, 100)
    i = 1

    fig, ax = plt.subplots()
    kfold = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    # for i, (train, test) in enumerate(cv.split(X_13, target)):
    for train, test in kfold.split(X_train, y_train):
        # create a fresh model for every fold
        model = Sequential()
        model.add(Dense(100, input_dim=X_train.shape[1], activation='relu',
                        kernel_constraint=MaxNorm(3)))
        model.add(Dropout(0.2))
        model.add(Dense(80, activation='relu', kernel_constraint=MaxNorm(3)))
        model.add(Dropout(0.2))
        model.add(Dense(1, activation='sigmoid'))

        # compile model (older Keras releases spell learning_rate as lr)
        sgd = SGD(learning_rate=0.1, momentum=0.8)
        model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])
        model.fit(X_train[train], y_train[train], epochs=100, batch_size=15, verbose=0)

        # evaluate the model on the held-out fold; with a single sigmoid output,
        # predict() returns the positive-class probability
        y_pred_keras = model.predict(X_train[test]).ravel()
        fpr, tpr, thresholds = roc_curve(y_train[test], y_pred_keras)
        # resample this fold's TPR onto the common FPR grid
        tprs.append(np.interp(mean_fpr, fpr, tpr))
        roc_auc = auc(fpr, tpr)
        aucs.append(roc_auc)
        plt.plot(fpr, tpr, lw=2, alpha=0.3,
                 label='ROC fold %d (AUC = %0.2f)' % (i, roc_auc))
        i = i + 1

    # chance line
    plt.plot([0, 1], [0, 1], linestyle='--', lw=2, color='black')

    # mean ROC curve over all folds
    mean_tpr = np.mean(tprs, axis=0)
    mean_auc = auc(mean_fpr, mean_tpr)
    plt.plot(mean_fpr, mean_tpr, color='blue',
             label=r'Mean ROC (AUC = %0.2f )' % (mean_auc), lw=2, alpha=1)

    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('ROC')
    plt.legend(loc="lower right")
    plt.show()

Hope it helps!
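To make the interpolation step in the loop above more concrete, here is a minimal NumPy-only sketch of how per-fold curves are resampled onto a common FPR grid and then averaged. The two fold curves are made-up numbers for illustration (not results from any model), and `trapezoid_auc` is a helper name introduced here, not something from Keras or scikit-learn.

```python
import numpy as np

# Hypothetical ROC points for two folds (made-up values for illustration only).
fold_curves = [
    (np.array([0.0, 0.2, 1.0]), np.array([0.0, 0.8, 1.0])),  # fold 1: (fpr, tpr)
    (np.array([0.0, 0.4, 1.0]), np.array([0.0, 0.6, 1.0])),  # fold 2: (fpr, tpr)
]


def trapezoid_auc(x, y):
    """Area under the piecewise-linear curve y(x), using the trapezoid rule."""
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))


# Common FPR grid: each fold's TPR is resampled onto it so the curves,
# which have different thresholds per fold, can be averaged point by point.
mean_fpr = np.linspace(0, 1, 6)

tprs = []
aucs = []
for fpr, tpr in fold_curves:
    interp_tpr = np.interp(mean_fpr, fpr, tpr)  # linear interpolation of TPR
    interp_tpr[0] = 0.0                         # force the curve to start at (0, 0)
    tprs.append(interp_tpr)
    aucs.append(trapezoid_auc(fpr, tpr))        # AUC of the original fold curve

mean_tpr = np.mean(tprs, axis=0)
mean_tpr[-1] = 1.0                              # force the curve to end at (1, 1)
mean_auc = trapezoid_auc(mean_fpr, mean_tpr)
std_auc = float(np.std(aucs))

print("per-fold AUC:", aucs)
print("mean ROC AUC: %.2f +/- %.2f" % (mean_auc, std_auc))
```

In the real loop, `fpr` and `tpr` come from `roc_curve` on each fold's held-out predictions, and a finer grid such as `np.linspace(0, 1, 100)` is the usual choice.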

Answer 2

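I just answered what appears to be a duplicate of this post here (apart from the variable names).

I am not sure whether it is an exact duplicate, since the questions come from different accounts, but it seems to be. Here is a copy of my answer in case one of them gets closed as a duplicate.

This is an implementation detail that is (probably) missing from the wrapper library.

Sklearn only checks whether the estimator has an attribute named _estimator_type set to the string value "classifier". You can see this by looking at the sklearn source code on GitHub: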

def is_classifier(estimator):
    """Return True if the given estimator is (probably) a classifier.
    Parameters
    ----------
    estimator : object
        Estimator object to test.
    Returns
    -------
    out : bool
        True if estimator is a classifier and False otherwise.
    """
    return getattr(estimator, "_estimator_type", None) == "classifier"

All you need to do is add this attribute to your classifier object manually:

classifier = KerasClassifier(build_fn=create_network,
                             epochs=10,
                             batch_size=100,
                             verbose=2)

classifier._estimator_type = "classifier"

I have tested this and it works.
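For anyone who wants to see the mechanism in isolation, here is a small library-free sketch. `looks_like_classifier` simply mirrors the attribute check quoted above, and `StandInWrapper` is a hypothetical stand-in for the Keras wrapper object; neither name exists in scikit-learn or Keras. Other scikit-learn versions may determine the estimator type differently, so treat this only as an illustration of the quoted check.

```python
def looks_like_classifier(estimator):
    """Mirror of the quoted check: compare the _estimator_type attribute."""
    return getattr(estimator, "_estimator_type", None) == "classifier"


class StandInWrapper:
    """Hypothetical wrapper that, like the one in the question, never sets
    _estimator_type on itself."""

    def fit(self, X, y):
        # A real wrapper would train the underlying network here.
        return self


wrapper = StandInWrapper()

# Without the attribute the check fails, which is the situation that
# produces the "should be a binary classifier" error in the question.
before = looks_like_classifier(wrapper)

# Setting the attribute by hand is the whole fix.
wrapper._estimator_type = "classifier"
after = looks_like_classifier(wrapper)

print("before:", before)
print("after:", after)
```

Because the attribute is set on the instance, it has to be set again whenever a new wrapper object is created.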

Solution 1
0 2020-11-16 21:18:43

Solution 2
0 2020-11-16 21:41:15
