How to manipulate large Eigen matrices with functional programming in C++?

Question

When dealing with large data structures, I always prefer to pass a buffer by reference so that a function can manipulate it. However, in functional programming this is forbidden because functions are supposed to be pure.

How would it be possible to implement a function in C++ like

value(const Eigen::VectorXd& _input, Eigen::MatrixXd& _large_matrix_output);

in a functional programming style?

In fact, the problem with implementing that as a pure function is the instantiation/allocation. The next example will allocate memory each time it is called:

Eigen::MatrixXd value(const Eigen::VectorXd& _input);

I thought of the following alternative, but it will require resizing the matrix each time the input changes.

class value{
  private:
   mutable Eigen::MatrixXd result;  
  public:
   Eigen::MatrixXd &operator()(const Eigen::VectorXd& _input) const{
     //... body
     return result;
   }
};

Answer 1

It seems like you are asking the impossible: How to access an existing buffer without passing that buffer.

There are a few options that go in that direction, but I'm pretty sure you won't like them:

1 Just give up

Eigen::MatrixXd value(const Eigen::Ref<const Eigen::VectorXd>&) will cause a new allocation, which you may then copy into a larger allocation. So it will be slower than it needs to be (much so with very large allocations). But it will be clean, simple, and the code within the function will be well optimized.

The compiler will not optimize this allocation away. Compilers are not magical.

2 Convert your function into a `CwiseNullaryOp` as described in the documentation

The idea is to return a custom Eigen expression that you can then assign to the large buffer. Especially since nullary ops basically wrap C++ lambdas (or functor structs) they allow a very functional coding style.

However, they also take away much of Eigen's optimization potential because they just compute a single scalar per call, not a SIMD packet. Basically, at this point we rely on the compiler's optimizations and whatever can be optimized within a single evaluation.

Something like this should work:

inline auto value(const Eigen::Ref<const Eigen::VectorXd>& vec)
{
    Eigen::Index rows = vec.size(), cols = vec.size();
    return Eigen::MatrixXd::NullaryExpr(rows, cols,
          [=](Eigen::Index row, Eigen::Index col) -> double {
              return vec[row] * vec[col];
    });
}

Eigen::MatrixXd buf(...);
buf.middleRows(...) = value(...);
buf = value(...).middleRows(...);

Note that this type of code makes it very easy to rely on dangling pointers. That lambda is still referencing the passed vector. Better hope you finish using that new expression before the vector is deallocated. This is especially concerning with Eigen::Ref because that can contain a temporary allocation, for example if you pass a row of a column-major matrix.

Better write it like this:

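template<class Derived>
auto value(const Eigen::MatrixBase<Derived>& vec)
{
    Eigen::Index rows = vec.size(), cols = vec.size();
    // Capture the derived object by value; capturing `vec` itself would try to
    // copy only the MatrixBase base class, not the actual vector.
    return Eigen::MatrixXd::NullaryExpr(rows, cols,
          [v = vec.derived()](Eigen::Index row, Eigen::Index col) -> double {
              return v[row] * v[col];
    });
}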

3 Return Eigen expressions

You can improve on version 2 by writing fully fledged Eigen expressions of your own. But writing them with vectorization is basically undocumented. You have to inspect the source code of Eigen and follow its patterns.

An easier approach is to rely on the return value deduction I also used above and simply return Eigen's expression.

template<class Derived>
auto value(const Eigen::MatrixBase<Derived>& vec)
{ return vec * vec.transpose(); }

buf.noalias() = value(...);

Note that this also does not allocate a matrix and evaluate the result into it. It returns an expression that can be evaluated into a matrix. So you had better get the lifetime of your objects right.

For example, if you cannot spot what is wrong with the following code, this style of programming is not for you.

template<class Derived>
auto value(const Eigen::MatrixBase<Derived>& vec)
{
    Eigen::RowVector3d tmp(1., 2., 3.);
    return vec * tmp;
}

Final thoughts

All of this may make your function interfaces look nicer (or worse). As far as writing pure functions goes, it just -- at best -- moves the ugly parts into Eigen. It doesn't eradicate them.

From a practical and performance point of view, the original version, where you pass the output buffer into the function, is still usually the best, unless you want to do crazy complicated mixing and matching between your matrix expressions.

void value(const Eigen::Ref<const Eigen::VectorXd>& in, Eigen::Ref<Eigen::MatrixXd> out);
