How to speed up a function that returns a pointer to an object in C++?

I am a mechanical engineer, so please understand that I have no formal training in programming. I have a finite element code that uses grids to build elements, which in turn make up a model. The element class is not important to this question, so I have left it out. The elements and grids are read in from a file, and that part works.

class Grid
{
private:
    int id;
    double x;
    double y;
    double z;
public:
    Grid();
    Grid(int, double, double, double);
    int get_id() { return id;};
};

Grid::Grid() {};
Grid::Grid(int t_id, double t_x, double t_y, double t_z)
{
    id = t_id; x = t_x; y = t_y; z = t_z;
}

class SurfaceModel
{
private:
    Grid** grids;
    Element** elements;
    int grid_count;
    int elem_count;
public:
    SurfaceModel();
    SurfaceModel(int, int);
    ~SurfaceModel();
    void read_grid(std::string);
    int get_grid_count() { return grid_count; };
    Grid* get_grid(int);
};

SurfaceModel::SurfaceModel()
{
    grids = NULL;
    elements = NULL;
}

SurfaceModel::SurfaceModel(int g, int e)
{
    grids = new Grid*[g];
    for (int i = 0; i < g; i++)
        grids[i] = NULL;
    elements = new Element*[e];
    for (int i = 0; i < e; i++)
        elements[i] = NULL;
}

void SurfaceModel::read_grid(std::string line)
{
    ... blah blah ...
    grids[index] = new Grid(n_id, n_x, n_y, n_z);
    ... blah blah ....
}

Grid* SurfaceModel::get_grid(int i)
{
    if (i < grid_count)
        return grids[i];
    else
        return NULL;
}

When I actually need to use a grid, I fetch it with get_grid, something like this:

SurfaceModel model(...);
.... blah blah ..... 
for (int i = 0; i < model.get_grid_count(); i++)
{
    Grid *cur_grid = model.get_grid(i);
    int cur_id = cur_grid->get_id();
}

My problem is that the call to get_grid seems to take more time than it should for simply returning an object. I ran gprof on the code and found that get_grid is called about 4 billion times during a very large simulation, and another operation that uses x, y, z is called about as often. That operation does some multiplication. What I found is that get_grid and the math each take about the same amount of time (~40 seconds). This makes me think I have done something wrong. Is there a faster way to get that object out of there?

I think you're forgetting to set grid_count and elem_count.

This means they will have uninitialized (indeterminate) values. If you use those values as loop bounds, you can easily end up running far more iterations than intended.

SurfaceModel::SurfaceModel() 
   : grid_count(0), 
     grids(NULL),
     elem_count(0),
     elements(NULL)
{
}

SurfaceModel::SurfaceModel(int g, int e)
   : grid_count(g), 
     elem_count(e)
{
    grids = new Grid*[g];
    for (int i = 0; i < g; i++)
        grids[i] = NULL;
    elements = new Element*[e];
    for (int i = 0; i < e; i++)
        elements[i] = NULL;
}

However, I suggest getting rid of every use of new in this program (and using a vector for the grids).
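To illustrate the vector suggestion, here is a minimal sketch of what the model could look like without any new. It is not the asker's actual code: add_grid is a hypothetical stand-in for the allocation line inside read_grid, the x/y/z accessors are added only for illustration, and the elements are left out because the Element class is not shown in the question.

```cpp
#include <cstddef>
#include <vector>

// Grid as in the question, with members initialized and const accessors
// (get_x/get_y/get_z are assumptions, not part of the original class).
class Grid
{
private:
    int id = 0;
    double x = 0.0;
    double y = 0.0;
    double z = 0.0;
public:
    Grid() = default;
    Grid(int t_id, double t_x, double t_y, double t_z)
        : id(t_id), x(t_x), y(t_y), z(t_z) {}
    int get_id() const { return id; }
    double get_x() const { return x; }
    double get_y() const { return y; }
    double get_z() const { return z; }
};

class SurfaceModel
{
private:
    std::vector<Grid> grids;  // owns the Grid objects; no manual new/delete
public:
    SurfaceModel() = default;

    // Reserve space up front when the grid count is known in advance.
    explicit SurfaceModel(std::size_t expected_grids)
    {
        grids.reserve(expected_grids);
    }

    // Hypothetical replacement for "grids[index] = new Grid(...)".
    void add_grid(int id, double x, double y, double z)
    {
        grids.emplace_back(id, x, y, z);
    }

    // The count can no longer get out of sync with the storage.
    std::size_t get_grid_count() const { return grids.size(); }

    // Same contract as the original: null pointer for an out-of-range index.
    const Grid* get_grid(std::size_t i) const
    {
        return i < grids.size() ? &grids[i] : nullptr;
    }
};
```

With this layout there is no separate grid_count to forget, and no destructor is needed to free the grids. Note that pointers returned by get_grid become invalid if add_grid later causes the vector to reallocate, so fill the model first and query it afterwards.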

On a modern CPU, accessing memory often takes longer than doing a multiplication. Getting good performance on modern systems often means focusing more on optimizing memory accesses than on optimizing computation. Because you are storing your grid objects as an array of pointers to dynamically allocated objects, the grid objects themselves will be stored non-contiguously in memory, and you will likely get many cache misses when trying to access them. In this example you would probably see a significant speedup by storing your grid objects directly in an array or vector, since your loop would then access contiguous memory and get good cache utilization and effective hardware prefetching.
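A rough way to check this on your own machine is to run the same loop over both layouts and time it. The sketch below is an assumption-laden illustration, not a rigorous benchmark: GridData, the make_* and sum_* functions, and seconds_for are all hypothetical names, and the product x * y * z merely stands in for "some multiplication".

```cpp
#include <chrono>
#include <cstddef>
#include <memory>
#include <vector>

// Plain stand-in for the question's Grid (hypothetical name).
struct GridData { int id; double x, y, z; };

// Layout from the question: one heap allocation per object, reached via pointers.
std::vector<std::unique_ptr<GridData>> make_scattered(std::size_t n)
{
    std::vector<std::unique_ptr<GridData>> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(std::unique_ptr<GridData>(
            new GridData{static_cast<int>(i), 1.0, 2.0, 3.0}));
    return v;
}

// Suggested layout: the objects themselves sit side by side in one buffer.
std::vector<GridData> make_contiguous(std::size_t n)
{
    std::vector<GridData> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(GridData{static_cast<int>(i), 1.0, 2.0, 3.0});
    return v;
}

// The same arithmetic over each layout.
double sum_scattered(const std::vector<std::unique_ptr<GridData>>& v)
{
    double s = 0.0;
    for (const auto& p : v)
        s += p->x * p->y * p->z;   // one extra pointer dereference per element
    return s;
}

double sum_contiguous(const std::vector<GridData>& v)
{
    double s = 0.0;
    for (const auto& g : v)
        s += g.x * g.y * g.z;      // sequential reads from one buffer
    return s;
}

// Wall-clock seconds taken by a callable, for a crude comparison.
template <typename F>
double seconds_for(F&& f)
{
    const auto t0 = std::chrono::steady_clock::now();
    f();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
```

To compare, build both containers with the same large n and wrap each sum_* call in seconds_for, using the returned sums so the compiler cannot discard the loops. Be aware that objects allocated back to back in a tight loop often land close together in memory, so this toy case may understate the gap seen in a long-running program where allocations are interleaved with other work.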

4 billion calls at one microsecond each (a perfectly acceptable time in many cases) would take 4,000 seconds. Since you measure only about 40 s (if I understand correctly), I doubt there is anything seriously wrong here. If it is still too slow for the task, I would consider parallel computing.
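The same arithmetic can be turned around to get the cost per call from the gprof figures in the question. The helper names below are made up for this sketch; the inputs are the roughly 4 billion calls and roughly 40 seconds reported by the asker.

```cpp
#include <cassert>

// Average cost of one call in nanoseconds, given total time and call count.
double nanoseconds_per_call(double total_seconds, double call_count)
{
    assert(call_count > 0.0);
    return total_seconds * 1e9 / call_count;
}

// Total time in seconds if every call took ns_per_call nanoseconds.
double estimated_total_seconds(double call_count, double ns_per_call)
{
    return call_count * ns_per_call / 1e9;
}
```

With the question's numbers, nanoseconds_per_call(40.0, 4e9) comes out at 10 ns per call, while estimated_total_seconds(4e9, 1000.0) reproduces the 4,000 seconds quoted for a one-microsecond call. Ten nanoseconds is a few dozen clock cycles on a typical machine, which is in the range you would expect for a non-inlined call plus a pointer load that sometimes misses the cache.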
