Scikit Learn SVC decision_function and predict

I'm trying to understand the relationship between decision_function and predict, which are instance methods of SVC ( http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html ). So far I've gathered that decision_function returns pairwise scores between classes. I was under the impression that predict chooses the class that maximizes its pairwise score, but I tested this out and got different results. Here's the code I was using to try to understand the relationship between the two. First I generated the pairwise score matrix, and then I printed out the class with the maximal pairwise score, which was different from the class predicted by clf.predict.

        # assumes pairwise (one-vs-one) scores, i.e. decision_function_shape='ovo'
        result = clf.decision_function(vector)[0]
        counter = 0
        num_classes = len(clf.classes_)
        pairwise_scores = np.zeros((num_classes, num_classes))
        for r in range(num_classes):
            for j in range(r + 1, num_classes):
                pairwise_scores[r][j] = result[counter]
                pairwise_scores[j][r] = -result[counter]
                counter += 1

        # argmax returns a flat index; integer division recovers the row
        index = np.argmax(pairwise_scores)
        predicted_class = index // num_classes
        print(predicted_class)
        print(clf.predict(vector)[0])

Does anyone know the relationship between predict and decision_function?

I don't fully understand your code, but let's go through the example from the documentation page you referenced:

import numpy as np
X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
y = np.array([1, 1, 2, 2])
from sklearn.svm import SVC
clf = SVC()
clf.fit(X, y) 

Now let's apply both decision_function and predict to the samples:

clf.decision_function(X)
clf.predict(X)

The output we get is:

array([[-1.00052254],
       [-1.00006594],
       [ 1.00029424],
       [ 1.00029424]])
array([1, 1, 2, 2])

And that is easy to interpret: the decision function tells us which side of the hyperplane generated by the classifier we are on (and how far away from it we are). Based on that information, the estimator then labels the examples with the corresponding label.
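The binary case described above can be checked directly. This is a minimal sketch, assuming the same four-sample data from the documentation example; the name `from_sign` is introduced here only for illustration. It maps the sign of each score to an entry of `clf.classes_` and compares the result with `clf.predict`.

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
y = np.array([1, 1, 2, 2])
clf = SVC().fit(X, y)

# Flatten so the check works whether decision_function returns
# shape (n_samples,) or (n_samples, 1), depending on the version.
scores = np.ravel(clf.decision_function(X))

# Negative score -> classes_[0], positive score -> classes_[1].
from_sign = clf.classes_[(scores > 0).astype(int)]
print(from_sign)
print(clf.predict(X))
```

Both print statements should show the same labels, since with two classes there is only one pairwise classifier and its sign decides the outcome.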

For those interested, I'll post a quick example of the predict function translated from C++ to Python:

import math
import numpy as np

# I've only implemented the linear and rbf kernels.
# params is the dict returned by clf.get_params(); gamma must be numeric.
def kernel(params, sv, X):
    if params['kernel'] == 'linear':
        return [np.dot(vi, X) for vi in sv]
    elif params['kernel'] == 'rbf':
        return [math.exp(-params['gamma'] * np.dot(vi - X, vi - X)) for vi in sv]

# This replicates clf.decision_function(X)
def decision_function(params, sv, nv, a, b, X):
    # calculate the kernels
    k = kernel(params, sv, X)

    # define the start and end index for support vectors for each class
    start = [sum(nv[:i]) for i in range(len(nv))]
    end = [start[i] + nv[i] for i in range(len(nv))]

    # calculate: sum(a_p * k(x_p, x)) between every 2 classes
    c = [ sum(a[ i ][p] * k[p] for p in range(start[j], end[j])) +
          sum(a[j-1][p] * k[p] for p in range(start[i], end[i]))
                for i in range(len(nv)) for j in range(i+1,len(nv))]

    # add the intercept
    return [sum(x) for x in zip(c, b)]

# This replicates clf.predict(X)
def predict(params, sv, nv, a, b, cs, X):
    ''' params = model parameters
        sv = support vectors
        nv = # of support vectors per class
        a  = dual coefficients
        b  = intercepts 
        cs = list of class names
        X  = feature to predict       
    '''
    decision = decision_function(params, sv, nv, a, b, X)
    votes = [(i if decision[p] > 0 else j) for p,(i,j) in enumerate((i,j) 
                                           for i in range(len(cs))
                                           for j in range(i+1,len(cs)))]

    return cs[max(set(votes), key=votes.count)]

There are a lot of input arguments for predict and decision_function, but note that these are all used internally by the model when calling predict(X). In fact, all of the arguments are accessible to you inside the model after fitting:

from sklearn import svm

# Create model
clf = svm.SVC(gamma=0.001, C=100.)

# Fit model using features, X, and labels, y.
clf.fit(X, y)

# Get parameters from model
params = clf.get_params()
sv = clf.support_vectors_
nv = clf.n_support_
a  = clf.dual_coef_
b  = clf._intercept_
cs = clf.classes_

# Use the functions to predict a single sample
print(predict(params, sv, nv, a, b, cs, X[0]))

# Compare with the builtin predict
print(clf.predict(X[:1]))
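As a sanity check on the kernel expansion used above, here is a minimal sketch restricted to the two-class case, where there is a single pairwise classifier. It relies only on public attributes (`support_vectors_`, `dual_coef_`, `intercept_`) and on `sklearn.metrics.pairwise.rbf_kernel`. The synthetic data from `make_blobs` and the fixed `gamma = 0.5` are assumptions made for the illustration, not part of the original answer.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

# Synthetic two-class data (an assumption made only for this check)
X, y = make_blobs(n_samples=100, centers=2, random_state=0)

gamma = 0.5  # fixed number so the same value can be passed to rbf_kernel
clf = SVC(kernel='rbf', gamma=gamma).fit(X, y)

# K[p, n] = k(support vector p, sample n)
K = rbf_kernel(clf.support_vectors_, X, gamma=gamma)

# sum_p a_p * k(x_p, x) + b, evaluated for every sample at once
manual = (clf.dual_coef_ @ K + clf.intercept_).ravel()

print(np.allclose(manual, clf.decision_function(X)))
```

With more than two classes the same sum is formed once per pair of classes, which is what the index bookkeeping with `start` and `end` in the translated code takes care of.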

There's a really nice Q&A for the multi-class one-vs-one scenario at datascience.sx:

Question

I have a multiclass SVM classifier with labels 'A', 'B', 'C', 'D'.

This is the code I'm running:

 >>> print(clf.predict([predict_this]))
 ['A']
 >>> print(clf.decision_function([predict_this]))
 [[ 185.23220833 43.62763596 180.83305074 -93.58628288 62.51448055 173.43335293]]

How can I use the output of the decision function to predict the class (A/B/C/D) with the highest probability and, if possible, its value? I have visited https://stackoverflow.com/a/20114601/7760998 but it is for binary classifiers, and I could not find a good resource that explains the output of decision_function for multiclass classifiers with shape ovo (one-vs-one).

Edit:

The above example is for class 'A'. For another input the classifier predicted 'C' and gave the following result from decision_function:

[[ 96.42193513 -11.13296606 111.47424538 -88.5356536 44.29272494 141.0069203 ]]

Another input, which the classifier also predicted as 'C', gave the following result from decision_function:

 [[ 290.54180354 -133.93467605 116.37068951 -392.32251314 -130.84421412 284.87653043]]

Had it been ovr (one-vs-rest), it would be easier: just select the class with the highest value. But with ovo (one-vs-one) there are (n * (n - 1)) / 2 values in the resulting list.

How can I deduce which class would be selected based on the decision function?

Answer

Your link has sufficient resources, so let's go through it:

When you call decision_function(), you get the output from each of the pairwise classifiers (n*(n-1)/2 numbers total). See pages 127 and 128 of "Support Vector Machines for Pattern Classification".

Click on the "page 127 and 128" link (not shown here, but in the Stack Overflow answer). You should see:

  • scikit-learn's SVC uses one-vs-one internally. That's exactly what the book is talking about.
  • For each pairwise comparison, we measure the decision function.
  • The decision function is just the regular binary SVM decision boundary.

What does that have to do with your question?

  • clf.decision_function() will give you the $D$ for each pairwise comparison.
  • The class with the most votes wins.

For instance,

[[ 96.42193513 -11.13296606 111.47424538 -88.5356536 44.29272494 141.0069203 ]]

is comparing:

[AB, AC, AD, BC, BD, CD]

We label each of them by the sign. We get:

[A, C, A, C, B, C]

For instance, 96.42193513 is positive and thus A is the label for AB.

Now we have three votes for C, so C is the prediction. If you repeat this procedure for the other two examples, you will get the same predictions as the classifier. Try it!
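The procedure above can be repeated for all three decision_function outputs from the question with a few lines of plain Python. This is a minimal sketch, assuming the columns are ordered AB, AC, AD, BC, BD, CD as stated; the helper name `vote` is hypothetical.

```python
from collections import Counter
from itertools import combinations

classes = ['A', 'B', 'C', 'D']
# Pairs in the same order as the decision_function columns:
# AB, AC, AD, BC, BD, CD
pairs = list(combinations(classes, 2))

def vote(decision_row):
    """Return the class that wins the most pairwise comparisons."""
    # A positive score is a vote for the first class of the pair,
    # anything else is a vote for the second.
    winners = [first if score > 0 else second
               for (first, second), score in zip(pairs, decision_row)]
    counts = Counter(winners)
    # max() keeps the first maximum, so a tie goes to the earlier class
    return max(classes, key=lambda c: counts[c])

examples = [
    [185.23220833, 43.62763596, 180.83305074, -93.58628288, 62.51448055, 173.43335293],
    [96.42193513, -11.13296606, 111.47424538, -88.5356536, 44.29272494, 141.0069203],
    [290.54180354, -133.93467605, 116.37068951, -392.32251314, -130.84421412, 284.87653043],
]
for row in examples:
    print(vote(row))
```

The three rows give A, C and C, matching the predictions reported in the question.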

When you call decision_function(), you get the output from each of the pairwise classifiers (n*(n-1)/2 numbers total). See pages 127 and 128 of "Support Vector Machines for Pattern Classification".

Each classifier puts in a vote as to what the correct answer is (based on the sign of the output of that classifier); predict() returns the class with the most votes.
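To see the n*(n-1)/2 pairwise outputs next to the per-class form, the sketch below fits two SVC models that differ only in `decision_function_shape`. The synthetic 4-class data from `make_blobs` is an assumption made for the illustration. As far as I know, this parameter only changes what decision_function returns; training is one-vs-one in both cases, so the predictions should agree.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic 4-class data (an assumption made only for this illustration)
X, y = make_blobs(n_samples=200, centers=4, random_state=0)

ovo = SVC(decision_function_shape='ovo').fit(X, y)
ovr = SVC(decision_function_shape='ovr').fit(X, y)

n = len(ovo.classes_)
ovo_shape = ovo.decision_function(X[:1]).shape  # one column per pair of classes
ovr_shape = ovr.decision_function(X[:1]).shape  # one column per class
print(ovo_shape, ovr_shape)

# The multi-class strategy used for predict is the same in both models
print(np.array_equal(ovo.predict(X), ovr.predict(X)))
```

With four classes the 'ovo' output has 4*3/2 = 6 columns per sample and the 'ovr' output has 4.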

The mathematical relationship between the two is probably somewhat complicated. But if you use decision_function with the LinearSVC classifier, the relationship becomes much clearer: there, decision_function gives you one score per class label (unlike SVC), and predict returns the class with the best score.
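The LinearSVC relationship described above can be sketched as follows, assuming the iris data set as a stand-in for any multi-class problem. `max_iter` and `random_state` are set only to keep the fit quiet and repeatable.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)
clf = LinearSVC(max_iter=10000, random_state=0).fit(X, y)

# One score per class for every sample: shape (n_samples, n_classes)
scores = clf.decision_function(X)

# Picking the highest-scoring column reproduces predict()
from_scores = clf.classes_[np.argmax(scores, axis=1)]
print(scores.shape)
print(np.array_equal(from_scores, clf.predict(X)))
```

Here no voting is involved: each class has its own score and the largest one wins.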

predict() follows a pairwise voting scheme which returns the class with the most votes over all pairwise comparisons. When two classes score the same, the class with the lowest index is returned.

Below is a Python example that applies this voting scheme to the n*(n-1)/2 pairwise scores returned by a one-versus-one decision_function().
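The tie-breaking rule can be illustrated without fitting a model. In this minimal sketch the three pairwise scores are hypothetical values chosen so that every class wins exactly one comparison; it implements the rule as stated above rather than inspecting the library's internals.

```python
import numpy as np
from itertools import combinations

classes = ['A', 'B', 'C']
# Hypothetical pairwise scores in the order AB, AC, BC:
# A beats B, C beats A, B beats C, so every class gets one vote.
decision = [1.0, -1.0, 1.0]

votes = np.zeros(len(classes), dtype=int)
for (i, j), score in zip(combinations(range(len(classes)), 2), decision):
    # positive score: vote for the first class of the pair, else the second
    votes[i if score > 0 else j] += 1

print(votes)
# np.argmax returns the first maximum, i.e. the lowest index on a tie
print(classes[np.argmax(votes)])
```

All three classes end up with one vote, and the first class in index order is returned.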

from sklearn import svm
from sklearn import datasets
from numpy import argmax, zeros
from itertools import combinations

# do pairwise comparisons, return class with most +1 votes
def ovo_vote(classes, decision_function):
    # pair up class indices (not labels) so arbitrary labels work
    combos = list(combinations(range(len(classes)), 2))
    votes = zeros(len(classes))
    for i in range(len(decision_function[0])):
        if decision_function[0][i] > 0:
            votes[combos[i][0]] += 1
        else:
            votes[combos[i][1]] += 1
    # argmax returns the first maximum, so ties go to the lowest index
    winner = argmax(votes)
    return classes[winner]

# load the digits data set
digits = datasets.load_digits()

X, y = digits.data, digits.target

# set the SVC's decision function shape to "ovo"
estimator = svm.SVC(gamma=0.001, C=100., decision_function_shape='ovo')

# train SVC on all but the last digit
estimator.fit(X[:-1], y[:-1])

# print the value of the last digit
print("To be classified digit: ", y[-1:][0])

# print the predicted class
pred = estimator.predict(X[-1:])
print("Perform classification using predict: ", pred[0])

# get decision function
df = estimator.decision_function(X[-1:])

# print the decision function itself
print("Decision function consists of",len(df[0]),"elements:")
print(df)

# get classes, here, numbers 0 to 9
digits = estimator.classes_

# print which class has most votes
vote = ovo_vote(digits, df)
print("Perform classification using decision function: ", vote)
