Looking for a way (ideally in Python) to find the second parameter value when equating two function calls

I have two functions, g1(x,y) and g2(x,y), that each return a float.

E.g., g1(1,2) --> returns 0.345
g2(1,2) --> returns 0.453

Now, for plotting a decision boundary, I want to satisfy:
g2(x,y) == g1(x,y),
or equivalently, rearranged to:
g1(x,y) - g2(x,y) == 0

If I generate a range of x values, say 1, 2, 3, 4, 5, how can I find the corresponding y values that yield g1(x,y) - g2(x,y) == 0?

I really have no idea how to do this and would appreciate any ideas. Do you think scipy.optimize.minimize would be a good approach? If so, how exactly would I use it? (I tried and failed with the syntax.)
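Since this is root finding in y for each fixed x (rather than minimization), scipy.optimize.fsolve is one natural fit. Here is a minimal sketch, with made-up toy functions standing in for g1 and g2:

from scipy.optimize import fsolve

# Toy stand-ins for g1 and g2, just to make the sketch runnable;
# the real discriminant functions are defined further below.
def g1(x, y):
    return x**2 + y - 3.0

def g2(x, y):
    return x + 2.0 * y

xs = [1, 2, 3, 4, 5]
# For each fixed x, solve g1(x, y) - g2(x, y) == 0 for y,
# starting the iteration from y = 0.
ys = [fsolve(lambda y, x=x: g1(x, y) - g2(x, y), 0.0)[0] for x in xs]
print(list(zip(xs, ys)))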

Thanks for your help!

EDIT:

You asked for the equations of g1() and g2(), here they are :)

$ g_1(\pmb{x}) = \pmb{x}^{\,t} \Big( -\frac{1}{2} \Sigma_1^{-1} \Big) \pmb{x} + \Big( \Sigma_1^{-1} \pmb{\mu}_1 \Big)^{t} \pmb{x} + \Big( -\frac{1}{2} \pmb{\mu}_1^{\,t} \Sigma_1^{-1} \pmb{\mu}_1 - \frac{1}{2} \ln(|\Sigma_1|) \Big) $

$ g_2(\pmb{x}) = \pmb{x}^{\,t} \Big( -\frac{1}{2} \Sigma_2^{-1} \Big) \pmb{x} + \Big( \Sigma_2^{-1} \pmb{\mu}_2 \Big)^{t} \pmb{x} + \Big( -\frac{1}{2} \pmb{\mu}_2^{\,t} \Sigma_2^{-1} \pmb{\mu}_2 - \frac{1}{2} \ln(|\Sigma_2|) \Big) $

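In compact form, with the names used in the implementation below:

$ g_i(\pmb{x}) = \pmb{x}^{\,t} W_i \, \pmb{x} + \pmb{w}_i^{\,t} \pmb{x} + \omega_i, \quad W_i = -\frac{1}{2}\Sigma_i^{-1}, \quad \pmb{w}_i = \Sigma_i^{-1}\pmb{\mu}_i, \quad \omega_i = -\frac{1}{2}\pmb{\mu}_i^{\,t}\Sigma_i^{-1}\pmb{\mu}_i - \frac{1}{2}\ln(|\Sigma_i|) $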

And this is how I implemented them:

import numpy as np

def discriminant_function(x_vec, cov_mat, mu_vec):
    """
    Calculates the value of the discriminant function for a dx1 dimensional
    sample given the covariance matrix and mean vector.

    Keyword arguments:
        x_vec: A dx1 dimensional numpy array representing the sample.
        cov_mat: numpy array of the covariance matrix.
        mu_vec: dx1 dimensional numpy array of the sample mean.

    Returns a float value as the result of the discriminant function.

    """
    cov_inv = np.linalg.inv(cov_mat)

    # W_i = -1/2 * Sigma_i^{-1}
    W_i = (-1/2) * cov_inv
    assert W_i.shape[0] > 1 and W_i.shape[1] > 1, 'W_i must be a matrix'

    # w_i = Sigma_i^{-1} * mu_i
    w_i = cov_inv.dot(mu_vec)
    assert w_i.shape[0] > 1 and w_i.shape[1] == 1, 'w_i must be a column vector'

    # omega_i = -1/2 * mu_i^t Sigma_i^{-1} mu_i - 1/2 * ln(|Sigma_i|);
    # both parts already carry their minus signs, so they are added
    omega_i_p1 = ((-1/2) * mu_vec.T).dot(cov_inv).dot(mu_vec)
    omega_i_p2 = (-1/2) * np.log(np.linalg.det(cov_mat))
    omega_i = omega_i_p1 + omega_i_p2
    assert omega_i.shape == (1, 1), 'omega_i must be a 1x1 array'

    # g_i(x) = x^t W_i x + w_i^t x + omega_i
    g = x_vec.T.dot(W_i).dot(x_vec) + w_i.T.dot(x_vec) + omega_i
    return float(g)
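A quick call, with hypothetical toy parameters (not the estimates from the dataset):

# Hypothetical toy parameters, just to illustrate the call signature.
mu_vec = np.array([[0.0], [0.0]])
cov_mat = np.array([[2.0, 0.0], [0.0, 1.0]])
x_vec = np.array([[1.0], [2.0]])

print(discriminant_function(x_vec, cov_mat=cov_mat, mu_vec=mu_vec))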

And for classifying the data I wrote:

import operator

def classify_data(x_vec, g, mu_vecs, cov_mats):
    """
    Classifies an input sample into one out of n classes, determined by
    maximizing the discriminant function g_i().

    Keyword arguments:
        x_vec: A dx1 dimensional numpy array representing the sample.
        g: The discriminant function.
        mu_vecs: A list of mean vectors as input for g.
        cov_mats: A list of covariance matrices as input for g.

    Returns a tuple (g_i()_value, class label).

    """
    assert(len(mu_vecs) == len(cov_mats)), 'Number of mu_vecs and cov_mats must be equal.'

    g_vals = []
    for m,c in zip(mu_vecs, cov_mats): 
        g_vals.append(g(x_vec, mu_vec=m, cov_mat=c))

    max_index, max_value = max(enumerate(g_vals), key=operator.itemgetter(1))
    return (max_value, max_index + 1)
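And a quick illustration of classify_data with two hypothetical classes (made-up parameters, for the call signature only):

# Hypothetical parameters for two classes, for illustration only.
mu_vecs = [np.array([[0.0], [0.0]]), np.array([[3.0], [3.0]])]
cov_mats = [np.eye(2), np.eye(2)]

x_vec = np.array([[2.5], [2.8]])
g_value, label = classify_data(x_vec, discriminant_function, mu_vecs, cov_mats)
print('predicted class: w{} (g = {:.3f})'.format(label, g_value))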

And the code works so far for classification, e.g.:

import prettytable

classification_dict, error = empirical_error(all_samples, [1,2], classify_data, [discriminant_function,\
        [mu_est_1, mu_est_2],
        [cov_est_1, cov_est_2]])

labels_predicted = ['w{} (predicted)'.format(i) for i in [1,2]]
labels_predicted.insert(0,'training dataset')

train_conf_mat = prettytable.PrettyTable(labels_predicted)
for i in [1,2]:
    a, b = [classification_dict[i][j] for j in [1,2]]
    # workaround to unpack (since Python does not support just '*a')
    train_conf_mat.add_row(['w{} (actual)'.format(i), a, b])
print(train_conf_mat)
print('Empirical Error: {:.2f} ({:.2f}%)'.format(error, error * 100))


+------------------+----------------+----------------+
| training dataset | w1 (predicted) | w2 (predicted) |
+------------------+----------------+----------------+
|   w1 (actual)    |       49       |       1        |
|   w2 (actual)    |       1        |       49       |
+------------------+----------------+----------------+
Empirical Error: 0.02 (2.00%)
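empirical_error itself is not shown in this post; purely as a sketch of a compatible helper, assuming all_samples maps each class label to an iterable of dx1 sample vectors, it might look like:

def empirical_error(all_samples, class_labels, classifier, classifier_args):
    """Sketch of a compatible empirical_error: returns a nested dict of
    counts classification_dict[actual][predicted] and the error rate.
    Assumes all_samples[label] is an iterable of dx1 sample vectors."""
    g, mu_vecs, cov_mats = classifier_args
    classification_dict = {a: {p: 0 for p in class_labels} for a in class_labels}
    n_total = n_wrong = 0
    for actual in class_labels:
        for x_vec in all_samples[actual]:
            _, predicted = classifier(x_vec, g, mu_vecs, cov_mats)
            classification_dict[actual][predicted] += 1
            n_total += 1
            n_wrong += (predicted != actual)
    return classification_dict, n_wrong / n_total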

For a simple dataset like this:

[image: scatter plot of the simple two-class training dataset]

EDIT:

For a simple case where the covariances are equal (linear decision boundary), I was able to use the fsolve function:

from scipy.optimize import fsolve

x = list(np.arange(-2, 6, 0.1))
# For each fixed x value i, solve g1(i, y) - g2(i, y) == 0 for y,
# starting the iteration at y = 0.
y = [fsolve(lambda y: discr_func(i, y, cov_mat=cov_est_1, mu_vec=mu_est_1) -
            discr_func(i, y, cov_mat=cov_est_2, mu_vec=mu_est_2), 0) for i in x]

http://oi62.tinypic.com/10r1zx4.jpg

However, it doesn't work for the quadratic case; I get

/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/scipy/optimize/minpack.py:236: RuntimeWarning: The iteration is not making good progress, as measured by the 
  improvement from the last five Jacobian evaluations.
  warnings.warn(msg, RuntimeWarning)

Any tips or alternatives?
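One generic alternative, sketched below reusing the discr_func and MLE-estimate names from this post: scan y for sign changes of g1 - g2 and refine each bracket with scipy.optimize.brentq. In the quadratic case a given x can have zero, one, or two boundary points, which is exactly where a single-start fsolve stalls.

from scipy.optimize import brentq
import numpy as np

def boundary_ys(xi, y_min=-10.0, y_max=10.0, n_grid=200):
    """Sketch: find all y in [y_min, y_max] with g1(xi, y) == g2(xi, y)
    by scanning for sign changes and refining each bracket with brentq.
    Note: a tangent root (touching zero without a sign change) is missed."""
    f = lambda y: (discr_func(xi, y, cov_mat=cov_est_1, mu_vec=mu_est_1) -
                   discr_func(xi, y, cov_mat=cov_est_2, mu_vec=mu_est_2))
    grid = np.linspace(y_min, y_max, n_grid)
    vals = [f(y) for y in grid]
    roots = []
    for a, b, fa, fb in zip(grid[:-1], grid[1:], vals[:-1], vals[1:]):
        if fa * fb < 0:  # sign change => exactly one root in (a, b)
            roots.append(brentq(f, a, b))
    return roots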

EDIT2:

I was able to solve it via scipy.optimize.bisect (used analogously to fsolve); since bisect only requires that g1 - g2 change sign somewhere inside the bracket, it is more robust here than an iteration from a single starting point. The results look "correct": I solved the equation for a simpler case where the decision boundary is a linear function (x2 = 3 - x1), and when I used bisect on it, it calculated exact results, e.g., for x1 = 3 and x2 = 3.

Anyway, here are the results for the quadratic case (parameters estimated via Maximum Likelihood Estimation) and the linear case with equal covariances. Thanks so much for your time and help!

[image: decision boundaries for the quadratic case and the linear (equal covariance) case]

The plots were generated with:

from matplotlib import pyplot as plt
import numpy as np
import scipy.optimize

x = np.arange(-6, 6, 0.1)
# boundary derived from the true parameters (true_dec_bound is defined elsewhere)
true_y = [true_dec_bound(x1) for x1 in x]

for i in [50, 1000, 10000]:

    # compute the boundary for the MLE estimates via bisection
    y_est = []
    for j in x:
        y_est.append(scipy.optimize.bisect(lambda y: discr_func(j, y, cov_mat=cov1_ests[i], mu_vec=mu1_ests[i]) -
                      discr_func(j, y, cov_mat=cov2_ests[i], mu_vec=mu2_ests[i]), -10, 10))
    y_est = [float(val) for val in y_est]

    # plot data
    f, ax = plt.subplots(figsize=(7, 7))
    plt.ylabel('$x_2$', size=20)
    plt.xlabel('$x_1$', size=20)
    ax.scatter(samples_c1[i][:, 0], samples_c1[i][:, 1],
               marker='o', color='green', s=40, alpha=0.5, label=r'$\omega_1$')
    ax.scatter(samples_c2[i][:, 0], samples_c2[i][:, 1],
               marker='^', color='red', s=40, alpha=0.5, label=r'$\omega_2$')
    plt.title('%s bivariate random training samples per class' % i)
    plt.legend()

    # plot the true-parameter boundary and the MLE boundary
    plt.plot(x, true_y, 'b--', lw=3, label='true param. boundary')
    plt.plot(x, y_est, 'k--', lw=3, label='MLE boundary')

    plt.legend(loc='lower left')
    plt.show()

Just wanted to post my tentative solution for now. But it's probably not optimal ...

import numpy as np
import scipy.optimize

def discr_func(x, y, cov_mat, mu_vec):
    """
    Calculates the value of the discriminant function for a 2x1 dimensional
    sample given the covariance matrix and mean vector.

    Keyword arguments:
        x, y: The two scalar components of the sample.
        cov_mat: numpy array of the covariance matrix.
        mu_vec: 2x1 dimensional numpy array of the sample mean.

    Returns a float value as the result of the discriminant function.

    """
    # stack the two scalars into the 2x1 column vector the matrix math expects
    x_vec = np.array([[x], [y]])

    cov_inv = np.linalg.inv(cov_mat)

    # W_i = -1/2 * Sigma_i^{-1}
    W_i = (-1/2) * cov_inv
    assert W_i.shape[0] > 1 and W_i.shape[1] > 1, 'W_i must be a matrix'

    # w_i = Sigma_i^{-1} * mu_i
    w_i = cov_inv.dot(mu_vec)
    assert w_i.shape[0] > 1 and w_i.shape[1] == 1, 'w_i must be a column vector'

    # omega_i = -1/2 * mu_i^t Sigma_i^{-1} mu_i - 1/2 * ln(|Sigma_i|);
    # both parts already carry their minus signs, so they are added
    omega_i_p1 = ((-1/2) * mu_vec.T).dot(cov_inv).dot(mu_vec)
    omega_i_p2 = (-1/2) * np.log(np.linalg.det(cov_mat))
    omega_i = omega_i_p1 + omega_i_p2
    assert omega_i.shape == (1, 1), 'omega_i must be a 1x1 array'

    # g_i(x) = x^t W_i x + w_i^t x + omega_i
    g = x_vec.T.dot(W_i).dot(x_vec) + w_i.T.dot(x_vec) + omega_i
    return float(g)

#g1 = discr_func(x, y, cov_mat=cov_mat1, mu_vec=mu_vec_1)
#g2 = discr_func(x, y, cov_mat=cov_mat2, mu_vec=mu_vec_2)

x_est50 = list(np.arange(-6, 6, 0.1))
y_est50 = []
for xi in x_est50:
    # bisect needs a bracket [-10, 10] on which g1 - g2 changes sign
    y_est50.append(scipy.optimize.bisect(lambda y: discr_func(xi, y, cov_mat=cov_est_1, mu_vec=mu_est_1) -
                      discr_func(xi, y, cov_mat=cov_est_2, mu_vec=mu_est_2), -10, 10))
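As a quick sanity check (a sketch, reusing the names above), the residual g1 - g2 evaluated along the computed boundary should be near zero:

# Evaluate the difference of the two discriminants along the boundary;
# every value should be close to machine precision if bisect converged.
residuals = [discr_func(xi, yi, cov_mat=cov_est_1, mu_vec=mu_est_1) -
             discr_func(xi, yi, cov_mat=cov_est_2, mu_vec=mu_est_2)
             for xi, yi in zip(x_est50, y_est50)]
print(max(abs(r) for r in residuals))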

Here is the result (blue: the quadratic case; red: the linear case with equal covariances): http://i.imgur.com/T16awxM.png?1
