
Decision boundary of logistic regression not correct

I know there are tons of questions on this out there, but I just can't get it to work. I want to plot the decision boundary of a logistic regression model, but the boundary I get is nowhere near what I expect. I haven't drawn new data to verify the classifier; however, the loss looks as expected and converges nicely. Please find my code below.

Edit1: I have now tried a couple of solutions and different ways of plotting the decision boundary, and it always looks the same, so there must be something wrong with my parameters. Does anyone have an idea what it might be?

Edit2: Out of frustration I randomly changed the decision boundary to y = (-(w[2]+w[0].item()*x)/w[1]).T and now it's working. Does anyone know why I had to swap the weights around?

This is what I get. What am I doing wrong?

import numpy as np
import matplotlib.pyplot as plt

def sig(x):
    # Sigmoid activation
    return 1/(1+np.exp(-x))

def N(mean, cov, n):
    # Draw n samples from a 2D Gaussian
    return np.matrix(np.random.multivariate_normal(mean, cov, n))

# Data
n = 100
mean_1, cov_1 = [3,3], [[0.7,-0.3],[0.3,0.5]]
mean_2, cov_2 = [1,2], [[0.5,0.3],[0.3,0.5]]
X_1, X_2 = N(mean_1, cov_1, n), N(mean_2, cov_2, n) # (100, 2)
X = np.vstack((X_1, X_2)) # (200, 2)
Ts = np.vstack((np.zeros((n, 1)), np.ones((n, 1)))) # (200, 1)
Xs = np.hstack((X, np.ones((n*2,1)))) # (200, 3)

# Parameters
w = np.matrix(np.random.rand(np.size(Xs,1))).T # (3, 1)
alpha = 1e-2

# Train
loss = []
epochs = 10000
for i in range(epochs):
    Ys = sig(Xs@w) # predictions, (200, 1)
    loss += [1/(n*2) * (-Ts.T@np.log(Ys) -(1-Ts.T)@np.log(1-Ys)).item()] # mean cross-entropy
    grad = Xs.T@(Ys-Ts) # gradient of the cross-entropy w.r.t. w
    w -= alpha/(n*2) * grad # gradient descent step

# plot loss
plot_loss = False
if plot_loss:
    plt.plot(range(len(loss)), loss)
    plt.title("Convergence of loss")
    plt.xlabel('Epochs')
    plt.ylabel('Loss')

# plot data
plot_data = True
if plot_data:
    plt.scatter([X_1[:,0]],[X_1[:,1]], color="b")
    plt.scatter([X_2[:,0]],[X_2[:,1]], color="r")
    plt.title("Data")
    plt.xlabel('Feature 1')
    plt.ylabel('Feature 2')

# plot decision boundary
plot_db = True
if plot_db:
    x = np.linspace(-1, 6, 1000)
    y = (-(w[0]+w[1].item()*x)/w[2]).T
    plt.plot(x,y,"r")

plt.show()

The problem is how the ones (bias) column is concatenated with the feature matrix: with Xs = np.hstack((X, np.ones((n*2,1)))) the intercept ends up in w[2] rather than w[0]. Switching the ones column with the feature matrix (putting it first) and plotting the boundary as in the code above yields the correct boundary.
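
For concreteness, here is a minimal sketch of the two equivalent fixes, assuming the variables n, X, w and x from the code above (boundary_y is just a helper name for illustration). The column order of Xs decides which entry of w is the intercept, so the plotting formula has to match it.

# With Xs = np.hstack((X, np.ones((n*2, 1)))) the columns are [x1, x2, 1],
# so w[2] is the intercept and the boundary w[0]*x1 + w[1]*x2 + w[2] = 0
# solved for x2 gives the formula from Edit2:
def boundary_y(w, x):
    # x2 = -(w[2] + w[0]*x1) / w[1]
    return -(w[2].item() + w[0].item() * x) / w[1].item()

x = np.linspace(-1, 6, 1000)
plt.plot(x, boundary_y(w, x), "r")
plt.show()

# Alternatively, build Xs with the ones column first:
#   Xs = np.hstack((np.ones((n*2, 1)), X))   # columns: [1, x1, x2]
# Then w[0] is the intercept and the original formula
#   y = (-(w[0] + w[1].item()*x) / w[2]).T
# plots the correct boundary without changing the plotting code in the question.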
