Creating arbitrary views of multi array

I am writing a C++ function to compute marginal PDFs (probability density functions). I receive multi-dimensional data (the PDF) defined on a grid over a number of variables, and I want to integrate the data over an arbitrary number of dimensions so that the function stays general.

The dimension of the PDF can be arbitrary and the dimension of the marginal PDF can also be arbitrary. It is not possible to define the order of the dimensions of the input data, so I send a vector to the function, which says which variables need to be kept. The other variables need to be integrated.

For example: number of variables: 5 (a,b,c,d,e), PDF data dimension 5, compute the marginal PDF of (a,c,d). This means the variables/dimensions 0,2,3 need to be kept and the others need to be integrated out (by a definite integral). So: PDF[a][b][c][d][e] -> MARGPDF[a][c][d] (which contains other values). For every [a][c][d] I need to perform an action on the data in the other dimensions [b][e]. I can do this by making a view, but I don't know how to do this dynamically. By dynamic I mean that both the number of dimensions and the choice of which dimensions are kept should be free.

Basically, what I want is to create a view of all the values in the dimensions b and e and do this for each (loop) value of a,c,d. However, I want the function to be general such that the input can be any multi array and the output variables will be free to choose. So it could also be: PDF[a][b][c][d] -> MARGPDF[c] or PDF[a][b][c][d][e][f] -> MARGPDF[b][d].

I've had the following idea: I sort the PDF multi array by dimension, such that I can create a view of the last number of dimensions, so: PDF[a][b][c][d][e] becomes PDF[a][c][d][b][e] . Then I loop over each a,c,d and create a view of the remaining 2 dimensions b and e. I perform a calculation using this view and save the value to MARGPDF[a][c][d].
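The reordering step described above can be sketched without Boost by copying the data into a flat row-major buffer in permuted axis order. This is only an illustration under that assumption: `permuteAxes`, `shape` and `perm` are hypothetical names, not part of boost::multi_array, and `perm[d]` is taken to mean "output axis d comes from input axis perm[d]".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copies a flat row-major array into a new flat row-major array whose axes
// are reordered: output axis d corresponds to input axis perm[d].
std::vector<double> permuteAxes(const std::vector<double>& data,
                                const std::vector<std::size_t>& shape,
                                const std::vector<std::size_t>& perm) {
    const std::size_t nd = shape.size();
    assert(perm.size() == nd);

    // Row-major strides of the input: the last axis has stride 1.
    std::vector<std::size_t> stride(nd, 1);
    for (std::size_t d = nd; d > 1; --d)
        stride[d - 2] = stride[d - 1] * shape[d - 1];

    std::vector<double> out(data.size());
    std::vector<std::size_t> idx(nd, 0);  // current index in OUTPUT axis order

    for (std::size_t pos = 0; pos < out.size(); ++pos) {
        // Offset of the same element in the input buffer.
        std::size_t src = 0;
        for (std::size_t d = 0; d < nd; ++d)
            src += idx[d] * stride[perm[d]];
        out[pos] = data[src];

        // Odometer-style increment of idx, last output axis fastest.
        for (std::size_t d = nd; d > 0; --d) {
            if (++idx[d - 1] < shape[perm[d - 1]]) break;
            idx[d - 1] = 0;
        }
    }
    return out;
}
```

For PDF[a][b][c][d][e] -> PDF[a][c][d][b][e] the permutation would be {0, 2, 3, 1, 4}. The copy costs one pass over the data; iterating with permuted strides directly would avoid it.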

What I need to know to perform such an operation is: How can I switch the order of the dimensions/indices of a boost::multi_array? How can I create a view when the dimensions are free to choose? Or do you have any other idea to accomplish the same thing?

The start of my code is provided below:

template<class DataType, int Dimension, int ReducedDimension>
boost::multi_array<DataType, ReducedDimension> ComputeMarginalPDF(boost::multi_array<DataType, Dimension> PDF,
                                                           std::vector< std::vector<DataType> > Variables , std::vector<int> VarsToKeep ){
// check input dimensions
if (VarsToKeep.size() != ReducedDimension ){
    std::cout << "Dimensions do not match" << std::endl;
}

// use DataType here so that Variables[...] can be copied for any DataType
std::vector< std::vector<DataType> > NewVariables ;

// Construct reduced array with proper dimensions
typedef boost::multi_array< DataType , ReducedDimension > ReducedArray ;
// 'typename' is required because ReducedArray::index is a dependent type
boost::array< typename ReducedArray::index , ReducedDimension > dimensions;

// get dimensions from array and insert into dimensions ;
// set Marginal PDF dimensions
for(std::size_t i = 0 ; i < VarsToKeep.size() ; i++){
    dimensions[i] = PDF.shape()[ VarsToKeep[i] ] ;
    NewVariables.push_back( Variables[ VarsToKeep[i] ] );
}

ReducedArray Marginal(dimensions) ;

// to be filled with code

I hope this is not too confusing. Any suggestions to improve the question are welcome.

I had a similar problem a few months ago, but I only had to calculate one-dimensional marginals. This is an outline of the solution that worked for me; I guess it can be adapted to multi-dimensional marginals as well:

I basically stored the PDF inside a one-dimensional array/vector (use whatever you like):

double* pdf = new double[a*b*c*d*e];

Then I used the fact that a two-dimensional array a[width][height] can be stored as a one-dimensional array b[width*height], with any element a[x][y] accessed as b[height*x + y] (row-major order, where the last index varies fastest). This formula generalizes to arbitrary dimensions, and with modulo and integer division you can also compute the inverse.
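As a minimal illustration of the two-dimensional case: for a[width][height] in row-major order, one step in x skips a full row of `height` elements, so the flat offset of a[x][y] is height*x + y. The helper names `flatIndex2D` and `unflatIndex2D` are made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Row-major layout for a[width][height]: the last index (y) varies fastest,
// so one step in x skips a whole row of `height` elements.
inline std::size_t flatIndex2D(std::size_t x, std::size_t y, std::size_t height) {
    assert(y < height);
    return height * x + y;
}

// Inverse mapping: recover (x, y) from the flat index using
// integer division and modulo.
inline void unflatIndex2D(std::size_t flat, std::size_t height,
                          std::size_t& x, std::size_t& y) {
    x = flat / height;
    y = flat % height;
}
```

With height = 4, element (1, 2) lands at flat position 6, and position 6 decodes back to (1, 2).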

Both the calculation from a one-dimensional index to a N-dimensional index and vice versa are pretty straightforward using templates. This allows you to convert your notation PDF[a][b][c][d][e] which is dependent on your dimension to notation like PDF(std::vector<size_t>{a,b,c,d,e}) which is easily expanded to arbitrary dimensions, since you can fill the vector in advance in a loop.
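When the number of dimensions is only known at run time, the same conversion can be written with std::vector instead of a template parameter. The following is a sketch under that assumption; `flatten` and `unflatten` are hypothetical names and assume row-major order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// N-dimensional index -> flat row-major position.
std::size_t flatten(const std::vector<std::size_t>& shape,
                    const std::vector<std::size_t>& idx) {
    assert(shape.size() == idx.size());
    std::size_t pos = 0;
    for (std::size_t d = 0; d < shape.size(); ++d) {
        assert(idx[d] < shape[d]);
        pos = pos * shape[d] + idx[d];  // Horner-style accumulation
    }
    return pos;
}

// Flat row-major position -> N-dimensional index.
std::vector<std::size_t> unflatten(const std::vector<std::size_t>& shape,
                                   std::size_t pos) {
    std::vector<std::size_t> idx(shape.size());
    // Peel off the fastest-varying (last) dimension first.
    for (std::size_t d = shape.size(); d > 0; --d) {
        idx[d - 1] = pos % shape[d - 1];
        pos /= shape[d - 1];
    }
    return idx;
}
```

Because the index is an ordinary vector, it can be filled in a loop for any number of dimensions, which is what makes the PDF(std::vector<size_t>{a,b,c,d,e}) style of access possible.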

If you think this approach might help you, I can try to get hold of some key functions of my implementation and add them here.

Edit:

template <size_t DIM>
inline void posToPosN(const size_t& pos,
    const size_t* const size,
    size_t* const posN){
    size_t r = pos;

    for (size_t i = DIM; i > 0; --i){
        posN[i - 1] = r % size[i - 1];
        r /= size[i - 1];
    }
}

template <size_t DIM>
inline void posNToPos(size_t& pos,
    const size_t* const size,
    const size_t* const posN){
    pos = 0;
    size_t mult = 1;

    for (size_t i = DIM; i > 0; --i){
        pos += mult * posN[i - 1];
        mult *= size[i - 1];
    }
}

template<typename type, size_t DIM>
class Iterator{
private:
    type* const _data; //pointer to start of Array
    size_t _pos; //1-dimensional position
    size_t _posN[DIM]; //n-dimensional position
    size_t const * const _size; //pointer to the _size-Member of Array
    size_t _total;

private:

public:
    Iterator(type* const data, const size_t* const size, size_t total, size_t pos)
        : _data(data), _pos(pos), _size(size), _total(total)
    {
        // _pos is unsigned, so only the upper bound needs checking
        if (_pos > _total) _pos = _total;
        posToPosN<DIM>(_pos, _size, _posN);
    }

    bool operator!= (const Iterator& other) const
    {
        return _pos != other._pos;
    }

    type& operator* () const{
        if (_pos >= _total)
            std::cout << "ERROR, dereferencing past-the-end iterator";
        return *(_data + _pos);
    }

    const Iterator& operator++ ()
    {
        ++_pos;
        if (_pos > _total) _pos = _total;

        posToPosN<DIM>(_pos, _size, _posN);
        return *this;
    }

    Iterator& operator +=(const size_t& b)
    {
        _pos += b;
        if (_pos > _total) _pos = _total;

        posToPosN<DIM>(_pos, _size, _posN);
        return *this;
    }

    const Iterator& operator-- ()
    {
        if (_pos == 0)
            _pos = _total;
        else
            --_pos;

        posToPosN<DIM>(_pos, _size, _posN);
        return *this;
    }

    //returns position in n-th dimension
    size_t operator[](size_t n){
        return _posN[n];
    }

    //returns a new iterator, advanced by n steps in the dim Dimension
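    Iterator advance(size_t dim, int steps = 1){
        // compute the target coordinate in a signed type: with size_t
        // arithmetic a negative step would wrap around instead of going below 0
        const long long target = static_cast<long long>(_posN[dim]) + steps;
        if (target < 0 || target >= static_cast<long long>(_size[dim])){
            return Iterator(_data, _size, _total, _total);
        }

        size_t stride = 1;
        for (size_t i = DIM - 1; i > dim; --i){
            stride *= _size[i];
        }

        const long long newPos = static_cast<long long>(_pos)
                               + steps * static_cast<long long>(stride);
        return Iterator(_data, _size, _total, static_cast<size_t>(newPos));
    }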
};


template <typename type, size_t DIM>
class Array{
    type* _data;
    size_t _size[DIM];
    size_t _total;

    void init(const size_t* const dimensions){
        _total = 1;
        for (size_t i = 0; i < DIM; i++){
            _size[i] = dimensions[i];
            _total *= _size[i];
        }

        _data = new type[_total];
    }

public:
    Array(const size_t* const dimensions){
        init(dimensions);
    }

    Array(const std::array<size_t, DIM>& dimensions){
        init(&dimensions[0]);
    }

    ~Array(){
        delete[] _data; // allocated with new[], so it must be released with delete[]
    }
    Iterator<type, DIM> begin(){
        return Iterator<type, DIM>(_data, _size, _total, 0);
    }
    Iterator<type, DIM> end(){
        return Iterator<type, DIM>(_data, _size, _total, _total);
    }
    const size_t* const size(){
        return _size;
    }
};


//for projections of the PDF
void calc_marginals(size_t dir, double* p_xPos, double* p_yPos){
    assert(dir < N_THETA);

    std::lock_guard<std::mutex> lock(calcInProgress);

    //reset to 0
    for (size_t i = 0; i < _size[dir]; ++i){
        p_yPos[i] = 0;
    }

    //calc projection
    double sum = 0;
    for (auto it = _p_theta.begin(); it != _p_theta.end(); ++it){

        p_yPos[it[dir]] += (*it);
        sum += (*it);
    }

    if (std::abs(sum - 1) > 0.001){ cout << "Warning: marginal[" << dir << "] not normalized" << endl; }
    //calc x-Axis
    for (size_t i = 0; i < _size[dir]; ++i){
        p_xPos[i] = _p[dir].start + double(i) / double(_size[dir] - 1)*(_p[dir].stop - _p[dir].start);
    }
}

The code consists of several parts:

  • Two functions, posToPosN() and posNToPos(), which do the mentioned transformation between one-dimensional and DIM-dimensional coordinates. The dimension is given here as the template parameter DIM. pos is just the one-dimensional position, posN is a pointer to an array of size DIM holding the DIM-dimensional coordinates, and size is an array of size DIM containing the extent in each direction (in your case something like {a,b,c,d,e}).
  • Iterator is an iterator class, allowing range-based or iterator-based for-loops over the DIM-dimensional Array. Note the operator[](size_t n), which returns the n-th component of the DIM-dimensional coordinates, and the advance() function, which returns an iterator to the element with coordinates {posN[0], posN[1], ..., posN[dim] + steps, ..., posN[DIM-1]}.
  • Array should be pretty straightforward.
  • calc_marginals is the function I used for calculating marginals. dir is the direction for which I want to calculate the marginal (remember: one-dimensional marginal), and the result is written to p_xPos and p_yPos; _p_theta is an Array. Note the iterator-based for-loop: (*it) refers to the double value of the PDF stored inside the array, just as with usual iterators. In addition, it[dir] returns the coordinate of the current value in direction dir. The last loop, which writes to p_xPos, is there only because I don't want indices in this array but real values.

I guess if you redefine the Iterator::operator[] to take a vector/array of dimension-indices and return a vector/array of the appropriate coordinates and add an Array::operator[] for random access which takes a vector/array as well you should be pretty much done.
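A sketch of how that generalization could look with flat storage and run-time dimensions, rather than the Iterator/Array classes above: every element is decoded into its N-dimensional coordinates, and the coordinates of the kept dimensions are re-encoded into the output position. `marginalSum` and its parameters are hypothetical names. Note that this accumulates a plain sum, as calc_marginals does; turning it into a definite integral on a uniform grid would additionally require multiplying by the grid spacing of each removed dimension.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums a flat row-major array over all axes NOT listed in `keep`.
// The result is a flat row-major array whose axes appear in the order
// given by `keep`.
std::vector<double> marginalSum(const std::vector<double>& pdf,
                                const std::vector<std::size_t>& shape,
                                const std::vector<std::size_t>& keep) {
    std::size_t total = 1;
    for (std::size_t s : shape) total *= s;
    assert(pdf.size() == total);

    std::size_t outTotal = 1;
    for (std::size_t k : keep) {
        assert(k < shape.size());
        outTotal *= shape[k];
    }
    std::vector<double> out(outTotal, 0.0);

    std::vector<std::size_t> idx(shape.size());
    for (std::size_t pos = 0; pos < total; ++pos) {
        // Decode the flat position into N-dimensional coordinates.
        std::size_t r = pos;
        for (std::size_t d = shape.size(); d > 0; --d) {
            idx[d - 1] = r % shape[d - 1];
            r /= shape[d - 1];
        }
        // Encode only the kept coordinates into the output position.
        std::size_t outPos = 0;
        for (std::size_t k : keep)
            outPos = outPos * shape[k] + idx[k];
        out[outPos] += pdf[pos];
    }
    return out;
}
```

For the example from the question, shape would hold the five extents of a,b,c,d,e and keep would be {0, 2, 3}. No reordering of the input is needed, because the kept coordinates are picked out per element.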

I fixed the issue. I figured out that I cannot create a boost::multi_array of arbitrary dimension, because it requires the dimension as a template parameter, which needs to be known at compile time. This means that I cannot create views of an arbitrary dimension.

Therefore, I did the following: I sorted the PDF such that the dimensions to be integrated out are the last dimensions (most likely not the most efficient method). Then I reduced the dimension of the PDF one at a time. In each pass I integrated out only one dimension, saving the result in a multi_array of the same size as the initial array (because I could not make the dimension dynamic). After that I copied the values to a multi_array of the reduced size (which was known).
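The one-dimension-at-a-time reduction can also be sketched with flat storage, where the number of dimensions is a run-time value and each pass can shrink the array directly. This is not the boost::multi_array code described above; `FlatArray` and `integrateOutAxis` are hypothetical names, and the trapezoidal rule is an assumed choice of quadrature.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Flat row-major array with a run-time shape.
struct FlatArray {
    std::vector<double> data;
    std::vector<std::size_t> shape;
};

// Integrates one axis out with the trapezoidal rule.
// `grid` holds the coordinate of each grid point along that axis.
FlatArray integrateOutAxis(const FlatArray& in, std::size_t axis,
                           const std::vector<double>& grid) {
    assert(axis < in.shape.size());
    const std::size_t n = in.shape[axis];
    assert(grid.size() == n);

    // View the array as [outer][n][inner] around the chosen axis.
    std::size_t outer = 1, inner = 1;
    for (std::size_t d = 0; d < axis; ++d) outer *= in.shape[d];
    for (std::size_t d = axis + 1; d < in.shape.size(); ++d) inner *= in.shape[d];
    assert(in.data.size() == outer * n * inner);

    FlatArray out;
    out.shape = in.shape;
    out.shape.erase(out.shape.begin() + static_cast<std::ptrdiff_t>(axis));
    out.data.assign(outer * inner, 0.0);

    for (std::size_t o = 0; o < outer; ++o) {
        for (std::size_t k = 0; k + 1 < n; ++k) {
            const double h = grid[k + 1] - grid[k];  // width of this interval
            for (std::size_t i = 0; i < inner; ++i) {
                const double f0 = in.data[(o * n + k) * inner + i];
                const double f1 = in.data[(o * n + k + 1) * inner + i];
                out.data[o * inner + i] += 0.5 * h * (f0 + f1);
            }
        }
    }
    return out;
}
```

Calling this repeatedly, starting with the highest axis number to be removed and working downwards, keeps the remaining axis numbers valid and avoids both the initial sort and the intermediate full-size array.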

I used the following link to loop, dimension independently, over the dimensions:

// Dimension-independent loop over boost::multi_array?
