简体   繁体   中英

How to use the Eigen unsupported levenberg marquardt implementation?

I'm trying to minimize a following sample function:

F(x) = f[0]^2(x[0],...,x[n-1]) + ... + f[m-1]^2(x[0],...,x[n-1])

A normal way to minimize such a funct could be the Levenberg-Marquardt algorithm. I would like to perform this minimization in c++ and have done some initial tests with Eigen that resulted in the expected solution.

My question is the following: I'm used to optimization in python using ie scipy.optimize.fmin_powell . Here the input function parameters are (func, x0, args=(), xtol=0.0001, ftol=0.0001, maxiter=None, maxfun=None, full_output=0, disp=1, retall=0, callback=None, direc=None) . So I can define a func(x0) , give the x0 vector and start optimizing. If needed I can change the optimization parameters.

Now the Eigen Lev-Marq algorithm works in a different way. I need to define a function vector (why?) Furthermore I can't manage to set the optimization parameters. According to:
http://eigen.tuxfamily.org/dox/unsupported/classEigen_1_1LevenbergMarquardt.html
I should be able to use the setEpsilon() and other set functions.

But when I have the following code:

my_functor functor;
Eigen::NumericalDiff<my_functor> numDiff(functor);
Eigen::LevenbergMarquardt<Eigen::NumericalDiff<my_functor>,double> lm(numDiff);
lm.setEpsilon(); //doesn't exist!

So I have 2 questions:

  1. Why is a function vector needed and why wouldn't a function scalar be enough?
    References where I've searched for an answer:
    http://www.ultimatepp.org/reference$Eigen_demo$en-us.html
    http://www.alglib.net/optimization/levenbergmarquardt.php

  2. How do I set the optimization parameters using the set functions?

So I believe I've found the answers.

1) The function is able to work as a function vector and as a function scalar.
If there are m solveable parameters, a Jacobian matrix of mxm needs to be created or numerically calculated. In order to do a Matrix-Vector multiplication J(x[m]).transpose*f(x[m]) the function vector f(x) should have m items. This can be the m different functions, but we can also give f1 the complete function and make the other items 0 .

2) The parameters can be set and read using lm.parameters.maxfev = 2000;

Both answers have been tested in the following example code:

#include <iostream>
#include <Eigen/Dense>

#include <unsupported/Eigen/NonLinearOptimization>
#include <unsupported/Eigen/NumericalDiff>

// Generic functor
template<typename _Scalar, int NX = Eigen::Dynamic, int NY = Eigen::Dynamic>
struct Functor
{
typedef _Scalar Scalar;
enum {
    InputsAtCompileTime = NX,
    ValuesAtCompileTime = NY
};
typedef Eigen::Matrix<Scalar,InputsAtCompileTime,1> InputType;
typedef Eigen::Matrix<Scalar,ValuesAtCompileTime,1> ValueType;
typedef Eigen::Matrix<Scalar,ValuesAtCompileTime,InputsAtCompileTime> JacobianType;

int m_inputs, m_values;

Functor() : m_inputs(InputsAtCompileTime), m_values(ValuesAtCompileTime) {}
Functor(int inputs, int values) : m_inputs(inputs), m_values(values) {}

int inputs() const { return m_inputs; }
int values() const { return m_values; }

};

struct my_functor : Functor<double>
{
my_functor(void): Functor<double>(2,2) {}
int operator()(const Eigen::VectorXd &x, Eigen::VectorXd &fvec) const
{
    // Implement y = 10*(x0+3)^2 + (x1-5)^2
    fvec(0) = 10.0*pow(x(0)+3.0,2) +  pow(x(1)-5.0,2);
    fvec(1) = 0;

    return 0;
}
};


int main(int argc, char *argv[])
{
Eigen::VectorXd x(2);
x(0) = 2.0;
x(1) = 3.0;
std::cout << "x: " << x << std::endl;

my_functor functor;
Eigen::NumericalDiff<my_functor> numDiff(functor);
Eigen::LevenbergMarquardt<Eigen::NumericalDiff<my_functor>,double> lm(numDiff);
lm.parameters.maxfev = 2000;
lm.parameters.xtol = 1.0e-10;
std::cout << lm.parameters.maxfev << std::endl;

int ret = lm.minimize(x);
std::cout << lm.iter << std::endl;
std::cout << ret << std::endl;

std::cout << "x that minimizes the function: " << x << std::endl;

std::cout << "press [ENTER] to continue " << std::endl;
std::cin.get();
return 0;
}

This answer is an extension of two existing answers: 1) I adapted the source code provided by @Deepfreeze to include additional comments and two different test functions. 2) I use the suggestion from @user3361661 to rewrite the objective function in the correct form. As he suggested, it reduced the iteration count on my first test problem from 67 to 4.

#include <iostream>
#include <Eigen/Dense>

#include <unsupported/Eigen/NonLinearOptimization>
#include <unsupported/Eigen/NumericalDiff>

/***********************************************************************************************/

// Generic functor
// See http://eigen.tuxfamily.org/index.php?title=Functors
// C++ version of a function pointer that stores meta-data about the function
template<typename _Scalar, int NX = Eigen::Dynamic, int NY = Eigen::Dynamic>
struct Functor
{

  // Information that tells the caller the numeric type (eg. double) and size (input / output dim)
  typedef _Scalar Scalar;
  enum { // Required by numerical differentiation module
      InputsAtCompileTime = NX,
      ValuesAtCompileTime = NY
  };

  // Tell the caller the matrix sizes associated with the input, output, and jacobian
  typedef Eigen::Matrix<Scalar,InputsAtCompileTime,1> InputType;
  typedef Eigen::Matrix<Scalar,ValuesAtCompileTime,1> ValueType;
  typedef Eigen::Matrix<Scalar,ValuesAtCompileTime,InputsAtCompileTime> JacobianType;

  // Local copy of the number of inputs
  int m_inputs, m_values;

  // Two constructors:
  Functor() : m_inputs(InputsAtCompileTime), m_values(ValuesAtCompileTime) {}
  Functor(int inputs, int values) : m_inputs(inputs), m_values(values) {}

  // Get methods for users to determine function input and output dimensions
  int inputs() const { return m_inputs; }
  int values() const { return m_values; }

};

/***********************************************************************************************/

// https://en.wikipedia.org/wiki/Test_functions_for_optimization
// Booth Function
// Implement f(x,y) = (x + 2*y -7)^2 + (2*x + y - 5)^2
struct BoothFunctor : Functor<double>
{
  // Simple constructor
  BoothFunctor(): Functor<double>(2,2) {}

  // Implementation of the objective function
  int operator()(const Eigen::VectorXd &z, Eigen::VectorXd &fvec) const {
    double x = z(0);   double y = z(1);
    /*
     * Evaluate the Booth function.
     * Important: LevenbergMarquardt is designed to work with objective functions that are a sum
     * of squared terms. The algorithm takes this into account: do not do it yourself.
     * In other words: objFun = sum(fvec(i)^2)
     */
    fvec(0) = x + 2*y - 7;
    fvec(1) = 2*x + y - 5;
    return 0;
  }
};

/***********************************************************************************************/

// https://en.wikipedia.org/wiki/Test_functions_for_optimization
// Himmelblau's Function
// Implement f(x,y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2
struct HimmelblauFunctor : Functor<double>
{
  // Simple constructor
  HimmelblauFunctor(): Functor<double>(2,2) {}

  // Implementation of the objective function
  int operator()(const Eigen::VectorXd &z, Eigen::VectorXd &fvec) const {
    double x = z(0);   double y = z(1);
    /*
     * Evaluate Himmelblau's function.
     * Important: LevenbergMarquardt is designed to work with objective functions that are a sum
     * of squared terms. The algorithm takes this into account: do not do it yourself.
     * In other words: objFun = sum(fvec(i)^2)
     */
    fvec(0) = x * x + y - 11;
    fvec(1) = x + y * y - 7;
    return 0;
  }
};

/***********************************************************************************************/

void testBoothFun() {
  std::cout << "Testing the Booth function..." << std::endl;
  Eigen::VectorXd zInit(2); zInit << 1.87, 2.032;
  std::cout << "zInit: " << zInit.transpose() << std::endl;
  Eigen::VectorXd zSoln(2); zSoln << 1.0, 3.0;
  std::cout << "zSoln: " << zSoln.transpose() << std::endl;

  BoothFunctor functor;
  Eigen::NumericalDiff<BoothFunctor> numDiff(functor);
  Eigen::LevenbergMarquardt<Eigen::NumericalDiff<BoothFunctor>,double> lm(numDiff);
  lm.parameters.maxfev = 1000;
  lm.parameters.xtol = 1.0e-10;
  std::cout << "max fun eval: " << lm.parameters.maxfev << std::endl;
  std::cout << "x tol: " << lm.parameters.xtol << std::endl;

  Eigen::VectorXd z = zInit;
  int ret = lm.minimize(z);
  std::cout << "iter count: " << lm.iter << std::endl;
  std::cout << "return status: " << ret << std::endl;
  std::cout << "zSolver: " << z.transpose() << std::endl;
  std::cout << "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" << std::endl;
}

/***********************************************************************************************/

void testHimmelblauFun() {
  std::cout << "Testing the Himmelblau function..." << std::endl;
  // Eigen::VectorXd zInit(2); zInit << 0.0, 0.0;  // soln 1
  // Eigen::VectorXd zInit(2); zInit << -1, 1;  // soln 2
  // Eigen::VectorXd zInit(2); zInit << -1, -1;  // soln 3
  Eigen::VectorXd zInit(2); zInit << 1, -1;  // soln 4
  std::cout << "zInit: " << zInit.transpose() << std::endl;
  std::cout << "soln 1: [3.0, 2.0]" << std::endl;
  std::cout << "soln 2: [-2.805118, 3.131312]" << std::endl;
  std::cout << "soln 3: [-3.77931, -3.28316]" << std::endl;
  std::cout << "soln 4: [3.584428, -1.848126]" << std::endl;

  HimmelblauFunctor functor;
  Eigen::NumericalDiff<HimmelblauFunctor> numDiff(functor);
  Eigen::LevenbergMarquardt<Eigen::NumericalDiff<HimmelblauFunctor>,double> lm(numDiff);
  lm.parameters.maxfev = 1000;
  lm.parameters.xtol = 1.0e-10;
  std::cout << "max fun eval: " << lm.parameters.maxfev << std::endl;
  std::cout << "x tol: " << lm.parameters.xtol << std::endl;

  Eigen::VectorXd z = zInit;
  int ret = lm.minimize(z);
  std::cout << "iter count: " << lm.iter << std::endl;
  std::cout << "return status: " << ret << std::endl;
  std::cout << "zSolver: " << z.transpose() << std::endl;
  std::cout << "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" << std::endl;
}

/***********************************************************************************************/

int main(int argc, char *argv[])
{

std::cout << "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" << std::endl;
testBoothFun();
testHimmelblauFun();
return 0;
}

The output at the command line from running this test script is:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Testing the Booth function...
zInit:  1.87 2.032
zSoln: 1 3
max fun eval: 1000
x tol: 1e-10
iter count: 4
return status: 2
zSolver: 1 3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Testing the Himmelblau function...
zInit:  1 -1
soln 1: [3.0, 2.0]
soln 2: [-2.805118, 3.131312]
soln 3: [-3.77931, -3.28316]
soln 4: [3.584428, -1.848126]
max fun eval: 1000
x tol: 1e-10
iter count: 8
return status: 2
zSolver:  3.58443 -1.84813
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As an alternative you may simply create a new functor like this,

struct my_functor_w_df : Eigen::NumericalDiff<my_functor> {};

and then initialize the LevenbergMarquardt instance using like this,

my_functor_w_df functor;
Eigen::LevenbergMarquardt<my_functor_w_df> lm(functor);

Personally, I find this approach a bit cleaner.

It seems that the function is more general:

  1. Let's say you have an m parameter model.
  2. You have n observations to which you want to fit the m-parameter model in a least-squares sense.
  3. The Jacobian, if provided, will be n times m.

You will need to supply n error values in the fvec. Also, there is no need to square the f-values because it is implicitly assumed that the overall error function is made up of the sum of squares of the fvec components.

So, if you follow these guidelines and change the code to:

fvec(0) = sqrt(10.0)*(x(0)+3.0);
fvec(1) = x(1)-5.0;

It will converge in a ridiculously small number of iterations - like less than 5. I also tried it on a more complex example - the Hahn1 benchmark at http://www.itl.nist.gov/div898/strd/nls/data/hahn1.shtml with m=7 parameters and n=236 observations and it converges to the known right solution in only 11 iterations with the numerically computed Jacobian.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM