
Looking for ways to optimize this code

The following code is part of an edge detection program:

void detect_optimized(int width, int height, int threshold)
{
    int x, y;
    int tmp;
    int w = width - 1;   /* was width--, which leaves w equal to width */
    int h = height - 1;  /* was height--, same problem */

for (y = 1; y < w; y++)
    for (x = 1; x < h; x++)
    {
        tmp = mask_product(mask,a,x,y,0);
        if (tmp>255)
            tmp = 255;
        if (tmp<threshold)
            tmp = 0;
        c[x][y][0] = 255-tmp;

        tmp = mask_product(mask,a,x,y,1);
        if (tmp>255)
            tmp = 255;
        if (tmp<threshold)
            tmp = 0;
        c[x][y][1] = 255-tmp;

        tmp = mask_product(mask,a,x,y,2);
        if (tmp>255)
            tmp = 255;
        if (tmp<threshold)
            tmp = 0;
        c[x][y][2] = 255-tmp;
    }
}

I have been trying to implement blocking with the following code, but I am not sure how to use it in this case.

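You can swap the loops to get better cache utilization. C stores arrays in row-major order, so c[x][y][k] and c[x][y+1][k] sit next to each other in memory, while a step in x jumps over a whole row. With y as the inner loop, the accesses walk through memory sequentially. This should speed up your code significantly (especially for large data).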

for (x = 1; x < h; x++)
    for (y = 1; y < w; y++)
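
To see why the order matters, here is a minimal sketch that measures the memory distance covered by one step of each index. The array sizes and the unsigned char element type are assumptions for illustration, since the question does not show how c is declared.

```c
#include <stddef.h>

enum { H = 4, W = 5, NUM_COLORS = 3 };      /* hypothetical image size */
static unsigned char c[H][W][NUM_COLORS];  /* same index order as c[x][y][k] */

/* Bytes between c[0][0][0] and c[0][1][0]: one step of the y index. */
ptrdiff_t step_in_y(void)
{
    return (const char *)&c[0][1] - (const char *)&c[0][0];
}

/* Bytes between c[0][0][0] and c[1][0][0]: one step of the x index. */
ptrdiff_t step_in_x(void)
{
    return (const char *)&c[1] - (const char *)&c[0];
}
```

With these sizes a step in y moves NUM_COLORS bytes, while a step in x moves W * NUM_COLORS bytes, so an inner loop over x touches a different row on every iteration.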

Another substantial benefit comes from distributing the loop iterations over multiple threads to exploit multicore architectures. With OpenMP, this can be achieved with a single compiler directive, as follows.

#pragma omp parallel for private(y, tmp)
for (x = 1; x < h; x++)
    for (y = 1; y < w; y++)
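
For context, this is roughly how the whole function might look with the directive in place. It is only a sketch: the array sizes, the mask values, and the body of mask_product are stand-ins I made up, because the question does not show them. Variables declared inside the loops are private to each thread automatically, so no private clause is needed here. OpenMP has to be enabled at compile time (for GCC, with -fopenmp); otherwise the pragma is ignored and the loop runs serially.

```c
#include <assert.h>

enum { ROWS = 64, COLS = 48, NUM_COLORS = 3 };   /* hypothetical sizes */
static unsigned char a[ROWS][COLS][NUM_COLORS];  /* input image */
static unsigned char c[ROWS][COLS][NUM_COLORS];  /* output image */
static const int mask[3][3] = { {-1, -1, -1}, {-1, 8, -1}, {-1, -1, -1} };

/* Stand-in for the question's mask_product: absolute value of a 3x3
   weighted sum around (x, y) in color channel k. */
static int mask_product(const int m[3][3],
                        unsigned char img[ROWS][COLS][NUM_COLORS],
                        int x, int y, int k)
{
    int sum = 0;
    for (int i = -1; i <= 1; i++)
        for (int j = -1; j <= 1; j++)
            sum += m[i + 1][j + 1] * img[x + i][y + j][k];
    return sum < 0 ? -sum : sum;
}

void detect_parallel(int width, int height, int threshold)
{
    assert(width <= COLS && height <= ROWS);
    int w = width - 1;
    int h = height - 1;

    /* Rows are independent, so each thread gets its own range of x. */
    #pragma omp parallel for
    for (int x = 1; x < h; x++)
        for (int y = 1; y < w; y++)
            for (int k = 0; k < NUM_COLORS; k++) {
                int tmp = mask_product(mask, a, x, y, k);
                if (tmp > 255)
                    tmp = 255;
                if (tmp < threshold)
                    tmp = 0;
                c[x][y][k] = (unsigned char)(255 - tmp);
            }
}
```

Parallelizing the outer loop keeps each thread on whole rows, which preserves the sequential access pattern inside a thread.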

Other optimizations are usually done by the compiler. Make sure to use appropriate compiler flags such as -O2, and don't bother with low-level tuning yourself.

I offer the following candidates:

  1. Avoid the if()s at the price of a multiplication (*). Various pipelined platforms will benefit.
  2. Swap the x, y loop order.
  3. Count down so that the end-of-loop test is against 0.
  4. Avoid recomputing the address of c[x][y].

This assumes all colors need to be processed.

Of course, YMMV.

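for (x = h-1; x > 0; x--) {
  /* Start at the last element of row x that gets written. */
  byte *p = &c[x][w-1][NUM_COLORS-1];

  for (y = w-1; y > 0; y--) {
    for (int z = NUM_COLORS-1; z >= 0; z--) {
      int tmp = mask_product(mask,a,x,y,z);
      /* Yields 0 if tmp > 255, 255 if tmp < threshold, else 255 - tmp. */
      *p = (255 - tmp*(tmp>=threshold))*(tmp <=255);
      p--;
    }
  }
}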
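
If you want to convince yourself that the multiplication trick matches the original if() chain, a small check like the one below can help. The function names are mine, not from the question. As far as I can tell the two forms agree whenever threshold is at most 255; for a larger threshold they differ for tmp above 255, because the original clamps to 255 first and then zeroes the result.

```c
/* Branching form, as in the question. */
int clamp_branching(int tmp, int threshold)
{
    if (tmp > 255)
        tmp = 255;
    if (tmp < threshold)
        tmp = 0;
    return 255 - tmp;
}

/* Branch-free form: each comparison evaluates to 0 or 1. */
int clamp_branchless(int tmp, int threshold)
{
    return (255 - tmp * (tmp >= threshold)) * (tmp <= 255);
}

/* Counts the inputs in [-1000, 1000] on which the two forms disagree. */
int count_mismatches(int threshold)
{
    int mismatches = 0;
    for (int tmp = -1000; tmp <= 1000; tmp++)
        if (clamp_branching(tmp, threshold) != clamp_branchless(tmp, threshold))
            mismatches++;
    return mismatches;
}
```

Calling count_mismatches with any threshold from 0 to 255 should return 0. Whether the branch-free form is actually faster depends on the compiler and the CPU, so measure before committing to it.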
