
Decision boundary of logistic regression not correct

I know there are tons of questions on this out there, but I just can't get it to work. I want to plot the decision boundary of a logistic regression model, but the boundary I get is nowhere near what I expect. I haven't drawn new data to verify the classifier; however, the loss looks as expected and converges nicely. Please find my code below.

Edit1: I have now tried a couple of solutions and different ways of plotting the decision boundary, and it always looks the same, so there must be something wrong with my parameters. Does anyone have an idea what it might be?

Edit2: Out of frustration I randomly changed the decision boundary to y = (-(w[2]+w[0].item()*x)/w[1]).T and now it's working. Does anyone know why I had to swap the weights around?

This is what I get. What am I doing wrong?

import numpy as np
import matplotlib.pyplot as plt

def sig(x):
    # Sigmoid activation
    return 1/(1+np.exp(-x))

def N(mean, cov, n):
    # Draw n samples from a 2D Gaussian
    return np.matrix(np.random.multivariate_normal(mean, cov, n))

# Data
n = 100
mean_1, cov_1 = [3,3], [[0.7,-0.3],[0.3,0.5]]
mean_2, cov_2 = [1,2], [[0.5,0.3],[0.3,0.5]]
X_1, X_2 = N(mean_1, cov_1, n), N(mean_2, cov_2, n) # (100, 2)
X = np.vstack((X_1, X_2)) # (200, 2)
Ts = np.vstack((np.zeros((n, 1)), np.ones((n, 1)))) # (200, 1)
Xs = np.hstack((X, np.ones((n*2,1)))) # (200, 3)

# Parameters
w = np.matrix(np.random.rand(np.size(Xs,1))).T # (3, 1)
alpha = 1e-2

# Train
loss = []
epochs = 10000
for i in range(epochs):
    Ys = sig(Xs@w) # predictions, (200, 1)
    loss += [1/(n*2) * (-Ts.T@np.log(Ys) -(1-Ts.T)@np.log(1-Ys)).item()] # mean cross-entropy
    grad = Xs.T@(Ys-Ts) # gradient of the cross-entropy w.r.t. w
    w -= alpha/(n*2) * grad # gradient descent step

# plot loss
plot_loss = False
if plot_loss:
    plt.plot(range(len(loss)), loss)
    plt.title("Convergence of loss")
    plt.xlabel('Epochs')
    plt.ylabel('Loss')

# plot data
plot_data = True
if plot_data:
    plt.scatter([X_1[:,0]],[X_1[:,1]], color="b")
    plt.scatter([X_2[:,0]],[X_2[:,1]], color="r")
    plt.title("Data")
    plt.xlabel('Feature 1')
    plt.ylabel('Feature 2')

# plot decision boundary
plot_db = True
if plot_db:
    x = np.linspace(-1, 6, 1000)
    y = (-(w[0]+w[1].item()*x)/w[2]).T
    plt.plot(x,y,"r")

plt.show()

The problem is how the ones (bias) column is concatenated with the feature matrix: with Xs = np.hstack((X, np.ones((n*2,1)))) the intercept ends up in w[2] rather than w[0]. Switching the ones column with the feature matrix (putting it first) and plotting the boundary as in the code above yields the correct boundary.
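
For concreteness, here is a minimal sketch of the two equivalent fixes, assuming the variables n, X, w and x from the code above (boundary_y is just a helper name for illustration). The column order of Xs decides which entry of w is the intercept, so the plotting formula has to match it.

# With Xs = np.hstack((X, np.ones((n*2, 1)))) the columns are [x1, x2, 1],
# so w[2] is the intercept and the boundary w[0]*x1 + w[1]*x2 + w[2] = 0
# solved for x2 gives the formula from Edit2:
def boundary_y(w, x):
    # x2 = -(w[2] + w[0]*x1) / w[1]
    return -(w[2].item() + w[0].item() * x) / w[1].item()

x = np.linspace(-1, 6, 1000)
plt.plot(x, boundary_y(w, x), "r")
plt.show()

# Alternatively, build Xs with the ones column first:
#   Xs = np.hstack((np.ones((n*2, 1)), X))   # columns: [1, x1, x2]
# Then w[0] is the intercept and the original formula
#   y = (-(w[0] + w[1].item()*x) / w[2]).T
# plots the correct boundary without changing the plotting code in the question.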
