
Plot a decision curve for logistic regression with a Gaussian kernel

I have tried logistic regression with polynomial features, and it works fine for me; I am also able to plot the decision curve. I used a map_feature function to build the polynomial features. (I followed Prof. Andrew Ng's notes on logistic regression with regularization: http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=MachineLearning&doc=exercises/ex5/ex5.html)

Now I am trying to achieve the same using a Gaussian kernel instead of polynomial features. My cost function (j_theta) works fine and decreases after every iteration, and I get my final theta value. The problem I face now is: how do I plot the decision boundary here?

I am using Octave to develop the algorithms and plot the graphs.

Below are the details of my data set sizes.

Original data set:

Data Set (x):  [20*3] where the first column is the intercept or the bias column

1.00  2.0000   1.0000
1.00  3.0000   1.0000
1.00  4.0000   1.0000
1.00  5.0000   2.0000
1.00  5.0000   3.0000
 .
 .
 .

Data set with new features after applying the Gaussian kernel:

Data set (f) : [20*21] the first column is the intercept column with all values as 1

1.0000e+000  1.0000e+000  6.0653e-001  1.3534e-001  6.7379e-003 . . . . . . . . 
1.0000e+000  6.0653e-001  1.0000e+000  6.0653e-001  8.2085e-002 . . . . . . . .
1.0000e+000  1.3534e-001  6.0653e-001  1.0000e+000  3.6788e-001
1.0000e+000  6.7379e-003  8.2085e-002  3.6788e-001  1.0000e+000
.               .
.               . 
.               .
.               .
.               .
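
The entries shown above are consistent with a Gaussian kernel of width sigma = 1 that uses the training points themselves as landmarks (for example, exp(-1/2) = 0.60653 for two points at distance 1). A minimal sketch of how such a feature matrix could be built, assuming exactly that setup; the names sq_dist and gauss_features are hypothetical helpers, not functions from the question:

```octave
% Hypothetical helpers (not from the question).
% sq_dist(A, B): squared Euclidean distance between every row of A and every row of B.
sq_dist = @(A, B) max(0, bsxfun(@plus, sum(A.^2, 2), sum(B.^2, 2)') - 2 * (A * B'));

% gauss_features(P, L, s): intercept column followed by one Gaussian kernel
% column per landmark (row of L), with kernel width s.
gauss_features = @(P, L, s) [ones(size(P, 1), 1), exp(-sq_dist(P, L) / (2 * s^2))];

% First five rows of the original data set, without the intercept column.
X = [2 1; 3 1; 4 1; 5 2; 5 3];

sigma = 1;                       % assumption: inferred from the values shown above
f = gauss_features(X, X, sigma); % 5-by-6 here; 20-by-21 for the full data set
disp(f(1:4, 1:5))
```

With the full 20-point data set in X, the same call would give the [20*21] matrix described above.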

The cost function graph that I get after applying gradient descent to my new feature data set (f) is:

[image: cost function plot]

Hence I get my new theta value:

theta: [21*1]
 3.8874e+000
 1.1747e-001
 3.5931e-002
-8.5937e-005
-1.2666e-001
-1.0584e-001
 .
 .
 .

The problem I face now is how to construct the decision curve on my original data set, given the new feature data set and the theta value. I have no clue how to proceed.

I would be glad to get any clues, tutorials, or links that could help me solve my problem.

I appreciate your help. Thanks.

The notes by Andrew Ng that you reference actually contain a very good example of how to draw the decision boundary. Also see this stackoverflow post. The basic steps to follow are as below:

  1. Choose a resolution based on the range of your input data, or feature vector X.
  2. Create a grid made up of every point within that resolution.
  3. Visit each point in the grid and predict its score using your learned logistic regression model.
  4. Use the score as the Z variable (the height on the contour plot) and plot the contour curve.

In the sample code below, we assume a 2D feature space in which each feature ranges from -1 to 1.5. We sample each axis at 200 evenly spaced points, and for each point in the grid we call the model predictor, map_feature(u,v) x theta, to get the point's score. Finally, the plot is drawn by calling the contour function in Matlab.
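
For the Gaussian-kernel model in the question, the only thing that should change is what gets multiplied by theta at each grid point. As a sketch, assuming the landmarks $l^{(i)}$ are the 20 training points and $\sigma$ is the kernel width that was used to build f (neither is stated explicitly in the question), the score of a point $x = (u, v)$ would be:

$$
f_i(x) = \exp\left(-\frac{\lVert x - l^{(i)} \rVert^2}{2\sigma^2}\right), \qquad
z(x) = \theta_0 + \sum_{i=1}^{20} \theta_i \, f_i(x)
$$

$$
h_\theta(x) = \frac{1}{1 + e^{-z(x)}} = 0.5 \iff z(x) = 0
$$

So the decision boundary is still the zero contour of the score; map_feature(u,v) is simply replaced by the kernel features of (u,v).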

Plotting the decision boundary here will be trickier than plotting the best-fit curve in linear regression. You will need to plot the $\theta^T x = 0$ line implicitly, by plotting a contour. This can be done by evaluating $\theta^Tx$ over a grid of points representing the original $u$ and $v$ inputs, and then plotting the line where $\theta^Tx$ evaluates to zero. The plot implementation for Matlab/Octave is given below.

% Define the ranges of the grid
u = linspace(-1, 1.5, 200);
v = linspace(-1, 1.5, 200);

% Initialize space for the values to be plotted
z = zeros(length(u), length(v));

% Evaluate z = theta*x over the grid
for i = 1:length(u)
    for j = 1:length(v)
        % Notice the order of j, i here!
        z(j,i) = map_feature(u(i), v(j))*theta;
    end
end

% contour(u, v, z) expects rows of z to follow v and columns to follow u.
% z was filled as z(j,i) above, so it already has that layout and must
% not be transposed again (a second transpose would flip the axes).
% Plot z = 0 by specifying the levels [0, 0]
contour(u, v, z, [0, 0], 'LineWidth', 2)
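
To adapt this to the Gaussian-kernel features from the question, each grid point has to be turned into kernel features against the training points before multiplying by theta. The following is a minimal sketch under several assumptions: sigma = 1 (consistent with the feature values shown in the question), landmarks equal to the training points, a five-point stand-in for the 20-point data set, and a placeholder theta that is NOT the fitted vector from the question. The helpers sq_dist and gauss_features are hypothetical names.

```octave
% Hypothetical helpers (not from the question).
sq_dist = @(A, B) max(0, bsxfun(@plus, sum(A.^2, 2), sum(B.^2, 2)') - 2 * (A * B'));
gauss_features = @(P, L, s) [ones(size(P, 1), 1), exp(-sq_dist(P, L) / (2 * s^2))];

sigma = 1;                             % assumption: kernel width used for training
X_train = [2 1; 3 1; 4 1; 5 2; 5 3];   % stand-in: replace with all 20 points (no intercept column)
theta = [0; 1; 1; 1; -1; -1];          % placeholder weights: replace with the learned 21-by-1 theta

% Grid covering the data range with a margin of 1 on each side
u = linspace(min(X_train(:, 1)) - 1, max(X_train(:, 1)) + 1, 200);
v = linspace(min(X_train(:, 2)) - 1, max(X_train(:, 2)) + 1, 200);
[U, V] = meshgrid(u, v);               % U and V are length(v)-by-length(u)

% Kernel features for every grid point, one row per point
F = gauss_features([U(:), V(:)], X_train, sigma);

% Score theta' * f at each grid point; z(j,i) belongs to (u(i), v(j))
z = reshape(F * theta, size(U));

% The decision boundary is the zero level set of the score
contour(u, v, z, [0, 0], 'LineWidth', 2);
hold on;
plot(X_train(:, 1), X_train(:, 2), 'ko');
hold off;
```

The grid is evaluated in one matrix product instead of a double loop, which also avoids the row/column ordering question: meshgrid already returns arrays in the layout that contour expects. The kernel features must be computed with the same sigma and the same landmarks that were used during training, otherwise theta does not apply to them.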

