
Compute the best cost function - SVM

I have to calculate the cost function for the following classification problem, using an SVM:

Training data:

X1   X2   Y
1.3  0.2  0
1.5  0.4  0
4.7  1.4  1
4.5  1.5  1

Candidate decision boundaries:

A. 1.6*x1 + 4*x2 - 5.6 = 0
B. 2.4*x1 + 4*x2 - 7.2 = 0
C. 0.96*x1 + 4*x2 - 4.8 = 0

How do I calculate the cost function for each of the decision boundaries above, to find the best one?

To calculate the "cost" of each given decision boundary, you first have to calculate the predicted label for each data point. For example, for decision boundary A and the first data point, substitute x1 = 1.3 and x2 = 0.2 to get the decision value 1.6 * 1.3 + 4 * 0.2 - 5.6 = -2.72. Since you are dealing with binary labels, the model (you're not training a model here; the decision boundary is already given) should normally come with a rule such as "if (1.6*x1 + 4*x2 - 5.6) >= 0, then the label is 1" (or vice versa). Double-check the given model. For example, let's assume the predicted label for decision boundary A and the first data point is 0 (since the computed value is -2.72 < 0); then this is a true negative (since the predicted label and the given label are both 0).
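The substitution step above can be sketched in Python for all three boundaries. This is only an illustration: the names `DATA`, `BOUNDARIES`, `decision_value` and `predict_label` are made up here, and the ">= 0 means label 1" rule is the assumption discussed above, not something the question states.

```python
# Training data from the question: (x1, x2, given label y)
DATA = [
    (1.3, 0.2, 0),
    (1.5, 0.4, 0),
    (4.7, 1.4, 1),
    (4.5, 1.5, 1),
]

# Candidate boundaries stored as (w1, w2, b), meaning w1*x1 + w2*x2 + b = 0
BOUNDARIES = {
    "A": (1.6, 4.0, -5.6),
    "B": (2.4, 4.0, -7.2),
    "C": (0.96, 4.0, -4.8),
}


def decision_value(boundary, x1, x2):
    """Left-hand side of the boundary equation evaluated at (x1, x2)."""
    w1, w2, b = boundary
    return w1 * x1 + w2 * x2 + b


def predict_label(boundary, x1, x2):
    """Assumed rule: decision value >= 0 gives label 1, otherwise label 0."""
    return 1 if decision_value(boundary, x1, x2) >= 0 else 0


for name, boundary in BOUNDARIES.items():
    for x1, x2, y in DATA:
        value = decision_value(boundary, x1, x2)
        label = predict_label(boundary, x1, x2)
        print(f"{name}: x=({x1}, {x2}) value={value:.3f} predicted={label} given={y}")
```

If the given model uses the opposite convention, flip the comparison in `predict_label`.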

There are 4 cases in total:

1) If the predicted label is 1 and the given label is 1, it's called a "true positive".
2) If the predicted label is 0 and the given label is 0, it's called a "true negative".
3) If the predicted label is 1 and the given label is 0, it's called a "false positive".
4) If the predicted label is 0 and the given label is 1, it's called a "false negative".
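The four cases map directly onto a small lookup. A minimal sketch, where the function name `outcome` is just an illustrative choice:

```python
def outcome(predicted, given):
    """Name the case for one data point; both labels must be 0 or 1."""
    if predicted not in (0, 1) or given not in (0, 1):
        raise ValueError("labels must be 0 or 1")
    if predicted == 1:
        return "true positive" if given == 1 else "false positive"
    return "true negative" if given == 0 else "false negative"


# Boundary A, first data point: predicted 0, given 0
print(outcome(0, 0))
```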

Then, usually, the cost function is a function of the numbers of "true positives", "true negatives", "false positives" and "false negatives". After you compute the predicted labels for all four given data points, you can count these outcomes and hence calculate the cost of the model.
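As a rough sketch of that counting step, the snippet below tallies the four outcomes per boundary and turns them into a cost. The cost formula (a weighted sum of false positives and false negatives) and the weights `fp_weight` and `fn_weight` are assumptions for illustration; use whatever cost function your assignment actually specifies. The ">= 0 means label 1" rule is likewise assumed.

```python
from collections import Counter

# Training data from the question: (x1, x2, given label y)
DATA = [(1.3, 0.2, 0), (1.5, 0.4, 0), (4.7, 1.4, 1), (4.5, 1.5, 1)]

# Candidate boundaries stored as (w1, w2, b), meaning w1*x1 + w2*x2 + b = 0
BOUNDARIES = {
    "A": (1.6, 4.0, -5.6),
    "B": (2.4, 4.0, -7.2),
    "C": (0.96, 4.0, -4.8),
}


def confusion_counts(boundary, data):
    """Count TP, TN, FP and FN for one boundary over the data set."""
    w1, w2, b = boundary
    counts = Counter(TP=0, TN=0, FP=0, FN=0)
    for x1, x2, y in data:
        # Assumed rule: decision value >= 0 gives label 1, otherwise 0.
        predicted = 1 if w1 * x1 + w2 * x2 + b >= 0 else 0
        if predicted == 1:
            counts["TP" if y == 1 else "FP"] += 1
        else:
            counts["TN" if y == 0 else "FN"] += 1
    return counts


def cost(counts, fp_weight=1.0, fn_weight=1.0):
    """Hypothetical cost: weighted number of misclassified points."""
    return fp_weight * counts["FP"] + fn_weight * counts["FN"]


for name, boundary in BOUNDARIES.items():
    counts = confusion_counts(boundary, DATA)
    print(name, dict(counts), "cost =", cost(counts))
```

Under the assumed rule, each of A, B and C classifies all four points correctly (2 true positives and 2 true negatives), so a cost based purely on these counts comes out as 0 for all three on this data.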

P.S. In 99.9% of cases the cost function will be in terms of "false positives" and "false negatives" only, but in some very rare cases it can also depend on "true positives" and "true negatives".


 