100% error rate on the test set with a one-class SVM

Question

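I am trying to detect anomalous images, but I am getting strange results from my model.

I read the images in with cv2, flattened them into 1-D arrays, converted them into a pandas DataFrame, and then fed that into the SVM.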

import glob
import os
import sys

import cv2
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn import svm
from sklearn.model_selection import train_test_split, GridSearchCV

Load the labels and file names

labels_wt = np.loadtxt("labels_wt.txt", delimiter="\t", dtype="str")
files_wt = np.loadtxt("files_wt.txt", delimiter="\t", dtype="str")

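Load and flatten the images

# Read each wild-type image and flatten it into a 1-D array
wt_images_tmp = [cv2.imread(file) for file in files_wt]
wt_images = [image.flatten() for image in wt_images_tmp]
tmp3 = np.array(wt_images)
# Same for the mutant images (files_mut is not defined in the code shown here)
mutant_images_tmp = [cv2.imread(file) for file in files_mut]
mutant_images = [image.flatten() for image in mutant_images_tmp]
tmp4 = np.array(mutant_images)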


X = pd.DataFrame(tmp3)  # wild-type images, one flattened image per row
y = pd.Series(labels_wt)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_outliers = pd.DataFrame(tmp4)  # mutant images, treated as outliers
# Fit the one-class SVM on the wild-type training images only
clf = svm.OneClassSVM(nu=0.15, kernel="rbf", gamma=0.0001)
clf.fit(X_train)

Then I evaluated the results, following the sklearn tutorial for the one-class SVM.

# predict() returns +1 for predicted inliers and -1 for predicted outliers
y_pred_train = clf.predict(X_train)
y_pred_test = clf.predict(X_test)
y_pred_outliers = clf.predict(X_outliers)
# Wild-type images flagged as outliers count as errors
n_error_train = y_pred_train[y_pred_train == -1].size
n_error_test = y_pred_test[y_pred_test == -1].size
# Mutant images accepted as inliers count as errors
n_error_outliers = y_pred_outliers[y_pred_outliers == 1].size

print(n_error_train / len(y_pred_train))
print(n_error_test / len(y_pred_test))
print(n_error_outliers / len(y_pred_outliers))

My error rate on the training set varies (10-30%), but on the test set it is never below 100%. Am I doing something wrong?
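A quick check that may help narrow down symptoms like these: with raw pixel intensities (0-255) and thousands of features, squared distances between images are very large, so an RBF kernel with gamma=0.0001 can evaluate to practically zero for any pair of distinct images. In that case every image the model has not seen looks like an outlier. The sketch below illustrates this on synthetic random data standing in for the flattened images; the array shapes, the seed, and the helper names `max_cross_kernel` and `held_out_error_rate` are assumptions for illustration, not taken from the question.

```python
import numpy as np
from sklearn import svm
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.model_selection import train_test_split

# Synthetic stand-in for flattened images: 60 samples, 3000 raw "pixel" values in 0-255.
rng = np.random.default_rng(0)
X_raw = rng.integers(0, 256, size=(60, 3000)).astype(np.float64)
X_train, X_test = train_test_split(X_raw, test_size=0.2, random_state=0)


def max_cross_kernel(train, test, gamma):
    """Largest RBF kernel value between any held-out sample and any training sample."""
    return float(rbf_kernel(test, train, gamma=gamma).max())


def held_out_error_rate(train, test, gamma):
    """Fit a one-class SVM on train; return the fraction of held-out samples predicted -1."""
    clf = svm.OneClassSVM(nu=0.15, kernel="rbf", gamma=gamma)
    clf.fit(train)
    return float(np.mean(clf.predict(test) == -1))


# Raw 0-255 values with the question's gamma: the kernel underflows to zero,
# so no held-out sample resembles any training sample.
raw_kernel = max_cross_kernel(X_train, X_test, gamma=0.0001)
raw_error = held_out_error_rate(X_train, X_test, gamma=0.0001)

# Same data rescaled to [0, 1], same gamma: kernel values are no longer degenerate.
scaled_kernel = max_cross_kernel(X_train / 255.0, X_test / 255.0, gamma=0.0001)

print("raw:    max kernel =", raw_kernel, " held-out error =", raw_error)
print("scaled: max kernel =", scaled_kernel)
```

If the real images behave similarly, rescaling the pixel values or trying a different gamma (for example gamma="scale") may be worth a try; whether that resolves the issue for the actual data is not something this toy example can show.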

Answer 1

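My guess is that setting random_state = 42 biases your train_test_split so that it always produces the same split pattern. You can read more about it in this answer. Don't specify any state and run the code again, like so: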

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
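For context, random_state only controls which rows end up in each split. A small sketch on toy data (the ten-row array and the helper name `held_out_rows` are assumptions for illustration) showing that a fixed seed reproduces the same split, while other seeds generally give other splits:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(10).reshape(-1, 1)  # ten toy samples with ids 0..9


def held_out_rows(seed):
    """Return the sorted sample ids that land in the test split for a given seed."""
    _, X_test = train_test_split(X, test_size=0.2, random_state=seed)
    return sorted(X_test.ravel().tolist())


same_a = held_out_rows(42)
same_b = held_out_rows(42)
# Collect the distinct test splits produced by twenty different seeds
splits = {tuple(held_out_rows(seed)) for seed in range(20)}

print("seed 42 twice gives the same split:", same_a == same_b)
print("distinct splits over 20 seeds:", len(splits))
```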

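This will give you different results. Once you have confirmed that this works, make sure you follow up with cross-validation, possibly k-fold validation. Let us know if this helps.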
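The k-fold step could be sketched roughly as follows. This is a minimal illustration on synthetic data, assuming 5 folds and features that are already on a small scale; the data, the fold count, gamma="scale" and the helper name `kfold_outlier_rates` are assumptions, not part of the original question.

```python
import numpy as np
from sklearn import svm
from sklearn.model_selection import KFold

# Synthetic stand-in for inlier (wild-type) features; shape and seed are arbitrary.
rng = np.random.default_rng(0)
X = rng.normal(loc=0.0, scale=1.0, size=(100, 5))


def kfold_outlier_rates(X, n_splits=5, nu=0.15, gamma="scale"):
    """Fit a one-class SVM on each training fold and return, per fold, the
    fraction of held-out samples predicted as outliers (-1)."""
    rates = []
    # shuffle=True mixes the rows; the seed is fixed only to make this sketch repeatable
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    for train_idx, test_idx in kf.split(X):
        clf = svm.OneClassSVM(nu=nu, kernel="rbf", gamma=gamma)
        clf.fit(X[train_idx])
        pred = clf.predict(X[test_idx])
        rates.append(float(np.mean(pred == -1)))
    return rates


rates = kfold_outlier_rates(X)
print("per-fold held-out error:", rates)
print("mean held-out error:", float(np.mean(rates)))
```

Looking at the per-fold rates rather than a single split makes it easier to tell whether a high error is a property of one unlucky split or of the model settings.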

Solution 1
0 2019-09-02 12:50:27
