Question

I have more than 2500 samples on which static analysis has been performed, with more than 300 features extracted per sample.

Among these samples I have identified more than 10 APT classes, and my aim is to build a one-class classifier for each class.

I'm using the Python scikit-learn library for machine learning, and in particular I'm working with One-class SVM.

First question: are there other good one-class classifiers for this approach?

Second question: I have to come up with a metric that defines a sort of "accuracy" for the classifier. I know that for a one-class SVM the concept of accuracy is not well defined. Here are my code and my idea:

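import pandas as pd
from sklearn import svm
from sklearn.model_selection import train_test_split

# Load the feature table for a single APT class
df = pd.read_csv('features_labeled_apt17.csv')

# Columns 1..340 hold the extracted features
# (.ix has been removed from pandas; .iloc does the same positional slice)
X = df.iloc[:, 1:341].values

# Hold out 30% of the samples for testing
X_train, X_test = train_test_split(X, test_size=0.3, random_state=42)

# gamma is ignored by the linear kernel, so it is not passed here
clf = svm.OneClassSVM(nu=0.1, kernel="linear")
clf.fit(X_train)

# predict returns +1 for inliers and -1 for outliers
pred = clf.predict(X_test)

print(pred)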

This is the output of the code:

[ 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1   1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1 -1   1  1  1  1  1  1  1  1  1  1  1  1  1 -1  1  1  1  1  1  1  1  1  1 -1  1   1  1  1  1  1  1  1 -1  1  1  1  1  1  1  1  1 -1  1  1  1
1 1  1  1  1   1  1  1  1  1 -1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  
1  1  1  1  1   1  1  1  1  1  1]

The 1 values are of course the correctly labeled samples, while the -1 values are the wrong ones.

First: do you think this is a good approach? Second: as a metric, would it make sense to divide the total number of elements in the test set by the number of wrongly labeled ones?
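The ratio asked about here can be read straight off the prediction array. Below is a minimal sketch, assuming `pred` holds the +1/-1 labels returned by `clf.predict(X_test)`; the helper name `outlier_fraction` and the short example array are hypothetical and not taken from the real output.

```python
import numpy as np

def outlier_fraction(pred):
    """Return the fraction of predictions equal to -1 (flagged as outliers)."""
    pred = np.asarray(pred)
    return float(np.mean(pred == -1))

# Hypothetical predictions standing in for clf.predict(X_test)
pred = np.array([1, 1, -1, 1, 1, 1, -1, 1, 1, 1])

rejected = outlier_fraction(pred)
print(f"rejected: {rejected:.2f}, accepted: {1 - rejected:.2f}")
```

Note that a test set drawn only from the target class measures just how often the model rejects its own class. Since `nu` is an upper bound on the fraction of training errors, a rejection rate near `nu` is roughly what one would expect. Judging how well the model keeps other classes out would need test samples from the other APT classes as well.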

Answer 1

From my understanding of machine learning algorithms, your use case is not a good fit for a one-class SVM classifier.

Normally, one-class SVM is used for unsupervised outlier detection problems. Refer to this page to see an implementation of one-class SVM for detecting outliers.
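To illustrate that outlier detection use, here is a minimal sketch on synthetic data (not the APT features from the question). The cluster positions, sample sizes, RBF kernel, and `gamma` value are all assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(42)

# Synthetic "normal" data: one cluster around the origin
X_train = rng.normal(0, 1, size=(200, 2))
# Held-out points from the same distribution (should mostly be accepted)
X_in = rng.normal(0, 1, size=(50, 2))
# Points from a distant cluster (should be flagged as outliers)
X_out = rng.normal(0, 1, size=(50, 2)) + 10.0

# Fit on normal data only; no labels are used
clf = OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1)
clf.fit(X_train)

pred_in = clf.predict(X_in)    # +1 = inlier, -1 = outlier
pred_out = clf.predict(X_out)

print("accepted same-distribution points:", np.mean(pred_in == 1))
print("rejected distant points:", np.mean(pred_out == -1))
```

The model is trained without any labels and only learns a boundary around the training data, which is why it is usually described as an outlier or novelty detector rather than a classifier.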

Just show your data frame and I will try to find another approach to solve your problem.

Solution 1
-1 2018-02-12 08:34:53