
How do I find the false positive and false negative rates for a neural network?

Deleted the text because I have not found a solution yet, and I realised I do not want other people stealing the first part that works.

As you have already loaded confusion_matrix from scikit-learn, you can use this one:

import numpy as np

cutoff = 0.5
y_pred = model.predict(x_test)            # predicted probabilities
y_pred_classes = np.zeros_like(y_pred)    # initialise an array full of zeros
y_pred_classes[y_pred > cutoff] = 1       # probabilities above the cutoff become class 1

y_test_classes = np.zeros_like(y_pred)    # same for the true labels
y_test_classes[y_test > cutoff] = 1
print(confusion_matrix(y_test_classes, y_pred_classes))

scikit-learn's confusion matrix is always ordered like this (which matches the ravel() call below):

True negatives     False positives
False negatives    True positives

For tn, fp, fn and tp you can run this:

tn, fp, fn, tp = confusion_matrix(y_test_classes, y_pred_classes).ravel()
(tn, fp, fn, tp)
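
Since the question asks for the false positive and false negative *rates*, here is a short follow-up sketch (standard definitions, not part of the original answer) using those four counts:

fpr = fp / (fp + tn)   # false positive rate: fraction of actual negatives flagged as positive
fnr = fn / (fn + tp)   # false negative rate: fraction of actual positives that were missed
print("False positive rate:", fpr)
print("False negative rate:", fnr)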

Your input to confusion_matrix must be an array of ints, not one-hot encodings.

# Predicting the Test set results
from sklearn import metrics

y_pred = model.predict(X_test)               # predicted probabilities (see output below)
y_pred = (y_pred > 0.5).astype(int).ravel()  # threshold at 0.5 -> 0/1 integer labels

# y_test must also be integer labels; if it is one-hot encoded, use y_test.argmax(axis=1)
matrix = metrics.confusion_matrix(y_test, y_pred)
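
As an aside, a minimal sketch (with made-up arrays) of how argmax recovers the integer labels that confusion_matrix expects from one-hot encodings:

import numpy as np

y_onehot = np.array([[1, 0],
                     [0, 1],
                     [0, 1]])          # hypothetical one-hot encoded labels
y_int = y_onehot.argmax(axis=1)        # -> array([0, 1, 1])
print(y_int)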

The output below (y_pred) comes out as probabilities, so applying a probability threshold of 0.5 transforms it into binary labels.

output (y_pred):

[0.87812372 0.77490434 0.30319547 0.84999743]
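
For example, thresholding that output at 0.5 (a minimal sketch using the probabilities shown above):

import numpy as np

y_pred = np.array([0.87812372, 0.77490434, 0.30319547, 0.84999743])
y_pred_binary = (y_pred > 0.5).astype(int)
print(y_pred_binary)    # [1 1 0 1]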

The sklearn.metrics.accuracy_score(y_true, y_pred) method defines y_pred as:

y_pred : 1d array-like, or label indicator array / sparse matrix. Predicted labels, as returned by a classifier.

Which means y_pred has to be an array of 1's or 0's (predicted labels). They should not be probabilities.
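
A short sketch of the difference (the y_true labels here are made up for illustration):

import numpy as np
from sklearn.metrics import accuracy_score

y_true = np.array([1, 1, 0, 1])               # hypothetical ground-truth labels
y_prob = np.array([0.88, 0.77, 0.30, 0.85])   # raw probabilities from a model

# accuracy_score(y_true, y_prob)  # raises ValueError: classification metrics
#                                 # can't handle a mix of binary and continuous targets

y_pred = (y_prob > 0.5).astype(int)           # hard 0/1 labels
print(accuracy_score(y_true, y_pred))         # 1.0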

The root cause of your error is a theoretical and not a computational issue: you are trying to use a classification metric (accuracy) in a regression (i.e. numeric prediction) model (Neural Logistic Model), which is meaningless.

Just like the majority of performance metrics, accuracy compares apples to apples (i.e. true labels of 0/1 with predictions again of 0/1); so, when you ask the function to compare binary true labels (apples) with continuous predictions (oranges), you get an expected error, where the message tells you exactly what the problem is from a computational point of view:

Classification metrics can't handle a mix of binary and continuous target

Although the message doesn't tell you directly that you are trying to compute a metric that is invalid for your problem (and we shouldn't actually expect it to go that far), it is certainly a good thing that scikit-learn at least gives you a direct and explicit warning that you are attempting something wrong; this is not necessarily the case with other frameworks - see for example the behavior of Keras in a very similar situation, where you get no warning at all, and one just ends up complaining about low "accuracy" in a regression setting...
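
As a rough sketch of that Keras behavior (a toy regression model on random data, purely illustrative and not from the question's code): compiling with metrics=['accuracy'] produces no warning, and Keras happily reports a meaningless "accuracy" figure.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

X = np.random.rand(100, 3)                 # random features
y = np.random.rand(100)                    # continuous targets, i.e. a regression problem

model = Sequential([Dense(8, activation='relu', input_dim=3),
                    Dense(1)])             # linear output -> numeric prediction
# no warning here, even though 'accuracy' is meaningless for regression
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
model.fit(X, y, epochs=2, verbose=0)
print(model.evaluate(X, y, verbose=0))     # [mse_loss, "accuracy"]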

from keras.models import Sequential
from keras.layers import Dense, Dropout
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sn
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


# read the csv file and convert into arrays for the machine to process
df = pd.read_csv('dataset_ori.csv')
dataset = df.values

# split the dataset into input features and the feature to predict
X = dataset[:,0:7]
Y = dataset[:,7]

# Splitting into Train and Test Set
X_train, X_test, y_train, y_test = train_test_split(X,
                                                    Y,
                                                    test_size = 0.2,
                                                    random_state = 0)

# Feature scaling (fit the scaler on the training set only)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Initialising the ANN
classifier = Sequential()

# Adding the input layer and the first hidden layer
classifier.add(Dense(units = 10, kernel_initializer = 'uniform', activation = 'relu', input_dim = 7))
classifier.add(Dropout(0.5))
# Adding the second hidden layer
classifier.add(Dense(units = 10, kernel_initializer = 'uniform', activation = 'relu'))
classifier.add(Dropout(0.5))
# Adding the output layer
classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))

# Compiling the ANN (single sigmoid output -> binary_crossentropy,
# not sparse_categorical_crossentropy)
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

# Fitting the ANN to the Training set
classifier.fit(X_train, y_train, batch_size = 10, epochs = 20)

# Summary of neural network
classifier.summary()

# Predicting the Test set results & giving a threshold probability
y_prob = classifier.predict(X_test)             # predicted probabilities
y_pred = (y_prob > 0.5).astype(int).ravel()     # threshold at 0.5 -> 0/1 class labels
print("\n\naccuracy", np.sum(y_pred == y_test) / float(len(y_test)))


## EXTRA: Confusion Matrix Visualize
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred) # rows = truth, cols = prediction
df_cm = pd.DataFrame(cm, index = (0, 1), columns = (0, 1))
plt.figure(figsize = (10,7))
sn.set(font_scale=1.4)
sn.heatmap(df_cm, annot=True, fmt='g')
print("Test Data Accuracy: %0.4f" % accuracy_score(y_test, y_pred))

# Let's see how our model performed
from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))
