Creating arbitrary views of multi array

I am writing a C++ function to compute marginal PDFs (probability density functions). This basically means that I get multi-dimensional data (the PDF) defined on a grid over a number of variables. I want to integrate the data over an arbitrary number of dimensions to keep the function general.

The dimension of the PDF can be arbitrary, and so can the dimension of the marginal PDF. It is not possible to fix the order of the dimensions of the input data, so I pass a vector to the function that says which variables need to be kept. The other variables need to be integrated out.

For example: number of variables: 5 (a,b,c,d,e), PDF data dimension 5, compute the marginal PDF of (a,c,d). This means the variables/dimensions 0,2,3 need to be kept and the others need to be integrated out (by a definite integral). So: PDF[a][b][c][d][e] -> MARGPDF[a][c][d] (which contains other values). For every [a][c][d] I need to perform an action on the data in the other dimensions [b][e]. I can do this by making a view, but I don't know how to do this dynamically. By dynamic I mean that the number of dimensions and the choice of which dimensions are kept should be free.
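Written out, the example above amounts to the following. This is only a sketch: the uniform grid spacings \Delta b and \Delta e and the plain Riemann sum are assumptions, since the question does not say which quadrature rule is intended.

```latex
\mathrm{MARGPDF}(a,c,d)
  = \iint \mathrm{PDF}(a,b,c,d,e)\,\mathrm{d}b\,\mathrm{d}e
  \approx \sum_{i}\sum_{j} \mathrm{PDF}[a][b_i][c][d][e_j]\,\Delta b\,\Delta e
```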

Basically, what I want is to create a view of all the values in the dimensions b and e, and to do this for each value of a,c,d (in a loop). However, I want the function to be general, such that the input can be any multi array and the output variables can be chosen freely. So it could also be: PDF[a][b][c][d] -> MARGPDF[c] or PDF[a][b][c][d][e][f] -> MARGPDF[b][d].

I've had the following idea: I sort the PDF multi array by dimension, such that I can create a view of the last few dimensions, so PDF[a][b][c][d][e] becomes PDF[a][c][d][b][e]. Then I loop over each a,c,d and create a view of the remaining 2 dimensions b and e. I perform a calculation using this view and save the value to MARGPDF[a][c][d].

What I need to know to perform such an operation is: How can I switch the order of the dimensions/indices of a boost::multi_array? How can I create a view when the dimensions are free? Or do you have any other idea to accomplish the same thing?

The start of my code is provided below:

#include <cstddef>
#include <stdexcept>
#include <vector>
#include <boost/array.hpp>
#include <boost/multi_array.hpp>

template<class DataType, std::size_t Dimension, std::size_t ReducedDimension>
boost::multi_array<DataType, ReducedDimension> ComputeMarginalPDF(const boost::multi_array<DataType, Dimension>& PDF,
                                                           const std::vector< std::vector<DataType> >& Variables , const std::vector<int>& VarsToKeep ){
// check input dimensions (continuing after a mismatch would index out of bounds below)
if (VarsToKeep.size() != ReducedDimension ){
    throw std::invalid_argument("Dimensions do not match");
}

// grid values of the variables that are kept
std::vector< std::vector<DataType> > NewVariables ;

// Construct reduced array with proper dimensions
typedef boost::multi_array< DataType , ReducedDimension > ReducedArray ;
// "typename" is required because index is a dependent type
boost::array< typename ReducedArray::index , ReducedDimension > dimensions;

// get dimensions from array and insert into dimensions ;
// set Marginal PDF dimensions
for(std::size_t i = 0 ; i < VarsToKeep.size() ; i++){
    dimensions[i] = PDF.shape()[ VarsToKeep[i] ] ;
    NewVariables.push_back( Variables[ VarsToKeep[i] ] );
}

ReducedArray Marginal(dimensions) ;

// to be filled with code

return Marginal ;
}

I hope I am not too confusing. Any suggestions to improve the question are welcome.

I had a similar problem a few months ago, but I only had to calculate one-dimensional marginals. This is an outline of the solution that worked for me; I guess it can be adapted to multi-dimensional marginals as well:

I basically stored the PDF inside a one-dimensional array/vector (use whatever you like):

double* pdf = new double[a*b*c*d*e];

Then I used the fact that you can store a two-dimensional array a[width][height] as a one-dimensional array b[width*height] and access any element a[x][y] as b[height*x + y]. You can generalize this formula to arbitrary dimensions, and with proper use of modulo and integer division you can also calculate the inverse.
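For reference, the generalization to N dimensions can be written as below. The symbols are introduced here for illustration only: x_k is the index along axis k, s_k is the extent of axis k, and the layout is assumed to be row-major (last index varies fastest).

```latex
\mathrm{pos} = \sum_{k=0}^{N-1} x_k \prod_{j=k+1}^{N-1} s_j ,
\qquad
x_k = \left\lfloor \frac{\mathrm{pos}}{\prod_{j=k+1}^{N-1} s_j} \right\rfloor \bmod s_k
```

For N = 2 with s_0 = width and s_1 = height this reduces to pos = height*x + y.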

Both the calculation from a one-dimensional index to an N-dimensional index and vice versa are pretty straightforward using templates. This allows you to convert your notation PDF[a][b][c][d][e], which depends on your dimension, to a notation like PDF(std::vector<size_t>{a,b,c,d,e}), which is easily extended to arbitrary dimensions, since you can fill the vector in advance in a loop.

If you think this approach might help you, I can try to get hold of some key functions of my implementation and add them here.

Edit:
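A minimal sketch of what such a vector-indexed accessor could look like. The class name FlatArray and its members are hypothetical and are not taken from the implementation described here; the sketch assumes row-major storage of double values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (an assumption for illustration): an N-dimensional
// array of doubles stored in one flat, row-major buffer. The number of
// dimensions is a run-time property, not a template parameter.
class FlatArray {
public:
    explicit FlatArray(const std::vector<std::size_t>& shape)
        : shape_(shape), data_(count(shape), 0.0) {}

    // Access by a multi-index whose length is only known at run time.
    double& operator()(const std::vector<std::size_t>& idx) {
        return data_[flatten(idx)];
    }

    std::size_t size() const { return data_.size(); }

private:
    // Total number of elements for a given shape.
    static std::size_t count(const std::vector<std::size_t>& shape) {
        std::size_t n = 1;
        for (std::size_t s : shape) n *= s;
        return n;
    }

    // Row-major flattening: the last index varies fastest.
    std::size_t flatten(const std::vector<std::size_t>& idx) const {
        assert(idx.size() == shape_.size());
        std::size_t pos = 0;
        for (std::size_t k = 0; k < shape_.size(); ++k) {
            assert(idx[k] < shape_[k]);
            pos = pos * shape_[k] + idx[k];
        }
        return pos;
    }

    std::vector<std::size_t> shape_;
    std::vector<double> data_;
};
```

With this, PDF({a,b,c,d,e}) works for any number of indices, because the index vector can be built in a loop.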

template <size_t DIM>
inline void posToPosN(const size_t& pos,
    const size_t* const size,
    size_t* const posN){
    size_t r = pos;

    for (size_t i = DIM; i > 0; --i){
        posN[i - 1] = r % size[i - 1];
        r /= size[i - 1];
    }
}

template <size_t DIM>
inline void posNToPos(size_t& pos,
    const size_t* const size,
    const size_t* const posN){
    pos = 0;
    size_t mult = 1;

    for (size_t i = DIM; i > 0; --i){
        pos += mult * posN[i - 1];
        mult *= size[i - 1];
    }
}

template<typename type, size_t DIM>
class Iterator{
private:
    type* const _data; //pointer to start of Array
    size_t _pos; //1-dimensional position
    size_t _posN[DIM]; //n-dimensional position
    size_t const * const _size; //pointer to the _size-Member of Array
    size_t _total;

private:

public:
    Iterator(type* const data, const size_t* const size, size_t total, size_t pos)
        : _data(data), _pos(pos), _size(size), _total(total)
    {
        if (_pos > _total) _pos = _total; // size_t is unsigned, so a "< 0" check is never true
        posToPosN<DIM>(_pos, _size, _posN);
    }

    bool operator!= (const Iterator& other) const
    {
        return _pos != other._pos;
    }

    type& operator* () const{
        if (_pos >= _total)
            std::cout << "ERROR, dereferencing iterator past the end";
        return *(_data + _pos);
    }

    const Iterator& operator++ ()
    {
        ++_pos;
        if (_pos > _total) _pos = _total;

        posToPosN<DIM>(_pos, _size, _posN);
        return *this;
    }

    Iterator& operator +=(const size_t& b)
    {
        _pos += b;
        if (_pos > _total) _pos = _total;

        posToPosN<DIM>(_pos, _size, _posN);
        return *this;
    }

    const Iterator& operator-- ()
    {
        if (_pos == 0)
            _pos = _total;
        else
            --_pos;

        posToPosN<DIM>(_pos, _size, _posN);
        return *this;
    }

    //returns position in n-th dimension
    size_t operator[](size_t n){
        return _posN[n];
    }

    //returns a new iterator, advanced by n steps in the dim Dimension
    Iterator advance(size_t dim, int steps = 1){
        if (_posN[dim] + steps < 0 || _posN[dim] + steps >= _size[dim]){
            return Iterator(_data, _size, _total, _total);
        }

        size_t stride = 1;
        for (size_t i = DIM - 1; i > dim; --i){
            stride *= _size[i];
        }

        return Iterator(_data, _size, _total, _pos + steps*stride);
    }
};


template <typename type, size_t DIM>
class Array{
    type* _data;
    size_t _size[DIM];
    size_t _total;

    void init(const size_t* const dimensions){
        _total = 1;
        for (size_t i = 0; i < DIM; i++){
            _size[i] = dimensions[i];
            _total *= _size[i];
        }

        _data = new type[_total];
    }

public:
    Array(const size_t* const dimensions){
        init(dimensions);
    }

    Array(const std::array<size_t, DIM>& dimensions){
        init(&dimensions[0]);
    }

    ~Array(){
        delete[] _data; // allocated with new[], so it must be released with delete[]
    }
    Iterator<type, DIM> begin(){
        return Iterator<type, DIM>(_data, _size, _total, 0);
    }
    Iterator<type, DIM> end(){
        return Iterator<type, DIM>(_data, _size, _total, _total);
    }
    const size_t* const size(){
        return _size;
    }
};


//for projections of the PDF
void calc_marginals(size_t dir, double* p_xPos, double* p_yPos){
    assert(dir < N_THETA);

    std::lock_guard<std::mutex> lock(calcInProgress);

    //reset to 0
    for (size_t i = 0; i < _size[dir]; ++i){
        p_yPos[i] = 0;
    }

    //calc projection
    double sum = 0;
    for (auto it = _p_theta.begin(); it != _p_theta.end(); ++it){

        p_yPos[it[dir]] += (*it);
        sum += (*it);
    }

    // std::abs (from <cmath>) avoids accidentally calling the integer abs()
    if (std::abs(sum - 1) > 0.001){ cout << "Warning: marginal[" << dir << "] not normalized" << endl; }
    //calc x-Axis
    for (size_t i = 0; i < _size[dir]; ++i){
        p_xPos[i] = _p[dir].start + double(i) / double(_size[dir] - 1)*(_p[dir].stop - _p[dir].start);
    }
}

The code consists of several parts:

  • Two functions posToPosN() and posNToPos(), which do the mentioned transformation between one-dimensional and DIM-dimensional coordinates. The dimension is given here as the template parameter DIM. pos is just the one-dimensional position, posN is a pointer to an array of size DIM holding the DIM-dimensional coordinates, and size is an array of size DIM containing the extent in each direction (in your case something like {a,b,c,d,e}).
  • Iterator is an iterator class, allowing range-based or iterator-based for-loops over the DIM-dimensional Array. Note the operator[](size_t n), which returns the n-th component of the DIM-dimensional coordinates, and the advance() function, which returns an iterator to the element with coordinates {posN[0], posN[1], ..., posN[dim] + steps, ..., posN[DIM-1]}.
  • Array should be pretty straightforward.
  • calc_marginals is the function I used for calculating marginals. dir is the direction along which I want to calculate the marginal (remember: one-dimensional marginal), the results are written to p_xPos and p_yPos, and _p_theta is an Array. Note the iterator-based for-loop: (*it) refers to the double value of the PDF stored inside the array, just as with usual iterators. In addition, it[dir] returns the coordinate of the current value in direction dir. The last loop, which writes to p_xPos, is only there because I don't want indices in this array but real values.

I guess if you redefine Iterator::operator[] to take a vector/array of dimension indices and return a vector/array of the corresponding coordinates, and add an Array::operator[] for random access which takes a vector/array as well, you should be pretty much done.
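To illustrate how this could extend to multi-dimensional marginals, here is a sketch that skips the Iterator class and works directly on a flat, row-major std::vector. The function name marginalize and its parameters are assumptions made for this example. It only sums the values; for a true integral each sum would still need to be multiplied by the grid spacing of the removed variables.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum a flat, row-major array over every axis that is NOT listed in `keep`.
// `keep` holds the axes to retain, in the order they appear in the result.
// (Hypothetical helper for illustration, not part of the code above.)
std::vector<double> marginalize(const std::vector<double>& pdf,
                                const std::vector<std::size_t>& shape,
                                const std::vector<std::size_t>& keep) {
    std::size_t total = 1;
    for (std::size_t s : shape) total *= s;
    assert(pdf.size() == total);

    std::size_t reducedTotal = 1;
    for (std::size_t axis : keep) {
        assert(axis < shape.size());
        reducedTotal *= shape[axis];
    }

    std::vector<double> marginal(reducedTotal, 0.0);
    std::vector<std::size_t> idx(shape.size());

    for (std::size_t pos = 0; pos < total; ++pos) {
        // Flat position -> N-dimensional index (same idea as posToPosN).
        std::size_t r = pos;
        for (std::size_t i = shape.size(); i > 0; --i) {
            idx[i - 1] = r % shape[i - 1];
            r /= shape[i - 1];
        }
        // Kept coordinates -> flat position in the reduced array.
        std::size_t out = 0;
        for (std::size_t axis : keep) out = out * shape[axis] + idx[axis];

        marginal[out] += pdf[pos];
    }
    return marginal;
}
```

For the example in the question, shape would hold the five extents and keep would be {0, 2, 3}; the result is the flat, row-major MARGPDF[a][c][d].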

I fixed the issue. I figured that I cannot create a boost::multi_array of arbitrary dimension, because it requires the dimension as a template parameter, which needs to be known at compile time. This means that I cannot create views of an arbitrary dimension.

Therefore, I did the following: I sorted the PDF such that the dimensions that will be integrated out are the last dimensions (most likely not the most efficient method). Then I reduced the dimension of the PDF one by one. In each loop I integrated out only 1 dimension, and I saved the result in a multi_array that had the same size as the initial array (because I could not make the dimension dynamic). After that I copied the values to a multi_array of the reduced size (which was known).

I used the answer to the following question to loop over the dimensions independently of their number:

Dimension-independent loop over boost::multi_array?
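The two steps described here (move the axes to be integrated to the back, then remove the last axis one at a time) can be sketched as follows. Note that this sketch swaps in a flat, row-major std::vector for boost::multi_array, so it is not the code that was actually used; permuteAxes and sumLastAxis are hypothetical names. It sums rather than integrates, so grid spacing is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reorder the axes of a flat, row-major array: new axis k is old axis order[k].
std::vector<double> permuteAxes(const std::vector<double>& in,
                                const std::vector<std::size_t>& shape,
                                const std::vector<std::size_t>& order) {
    const std::size_t n = shape.size();
    assert(order.size() == n);

    // Row-major strides of the input array.
    std::vector<std::size_t> stride(n, 1);
    for (std::size_t i = n; i > 1; --i) stride[i - 2] = stride[i - 1] * shape[i - 1];

    std::vector<double> out(in.size());
    std::vector<std::size_t> idx(n);  // multi-index in the permuted array
    for (std::size_t pos = 0; pos < out.size(); ++pos) {
        // Flat position in the output -> multi-index in the permuted shape.
        std::size_t r = pos;
        for (std::size_t k = n; k > 0; --k) {
            idx[k - 1] = r % shape[order[k - 1]];
            r /= shape[order[k - 1]];
        }
        // Multi-index -> flat position in the input.
        std::size_t src = 0;
        for (std::size_t k = 0; k < n; ++k) src += idx[k] * stride[order[k]];
        out[pos] = in[src];
    }
    return out;
}

// Sum out the last axis; the caller drops that axis from its shape afterwards.
std::vector<double> sumLastAxis(const std::vector<double>& in, std::size_t lastExtent) {
    assert(lastExtent > 0 && in.size() % lastExtent == 0);
    std::vector<double> out(in.size() / lastExtent, 0.0);
    for (std::size_t pos = 0; pos < in.size(); ++pos) out[pos / lastExtent] += in[pos];
    return out;
}
```

Calling sumLastAxis once per removed dimension mirrors the one-by-one reduction, and because the storage is a plain vector the intermediate results can shrink at each step.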
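The linked answer is not reproduced here. As a rough illustration of the general idea only, one common way to loop over all index combinations without knowing the number of dimensions at compile time is an "odometer" increment. This is an assumption about a typical technique, not necessarily what the linked answer does, and nextIndex and countIndices are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Advance `idx` to the next multi-index in row-major order.
// Returns false once the last combination has been passed
// (idx has then wrapped around to all zeros).
bool nextIndex(std::vector<std::size_t>& idx, const std::vector<std::size_t>& shape) {
    for (std::size_t k = idx.size(); k > 0; --k) {
        if (++idx[k - 1] < shape[k - 1]) return true;  // no carry needed
        idx[k - 1] = 0;                                // carry into the next axis
    }
    return false;
}

// Example use: visit every index tuple once and count the visits.
std::size_t countIndices(const std::vector<std::size_t>& shape) {
    for (std::size_t s : shape) {
        if (s == 0) return 0;  // an empty axis means there is nothing to visit
    }
    std::vector<std::size_t> idx(shape.size(), 0);
    std::size_t n = 0;
    do {
        ++n;  // a real loop body would read the array element at idx here
    } while (nextIndex(idx, shape));
    return n;
}
```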
