OpenMP parallelization stopped working

On Linux, AMD 8-core processor, using g++ 4.7.1.

This is, for me, a headbanger. The following code was working perfectly and, for some reason, stopped parallelizing. I added a call to omp_get_num_procs(), and it prints 8 processors. I checked the compilation, and -fopenmp is present as an option for both compiling and linking. There are no compilation or link error messages. I checked whether any OMP_xxx environment variables were defined; there were none.

Are there other, external factors that could have an influence?

#pragma omp parallel
{
  lightray ray;
  rgba L;
  printf("Max nr processors: %d\n", omp_get_num_procs());

  #pragma omp for schedule(dynamic)
  for (int xy = 0; xy < xy_range; xy++) {
    int x = x_from + (xy % x_width);
    int y = y_from + (xy / x_width);
    ray = cam->get_ray_at(x, y);
    L = trace_ray(ray, 0, cam->inter);
    cam->set_pixel(x, y, L);
  }
}
dtime = omp_get_wtime() - dtime;
printf("time %f\n", dtime);
}

EDIT: I think I've found something here... The command line for g++ generated by Anjuta contains this:

-DPACKAGE_LOCALE_DIR=\""/usr/local/share/locale"\" -DPACKAGE_SRC_DIR=\"".. -fopenmp  . "\" 

The PACKAGE_SRC_DIR definition seems to 'include' the -fopenmp flag, which would hide it from g++. Haven't found the cause yet...

Try rewriting it this way:

lightray ray;
rgba L;
printf("Max nr processors: %d\n", omp_get_num_procs());

#pragma omp parallel for schedule(dynamic) private(ray,L)
for (int xy = 0; xy < xy_range; xy++) {
  int x = x_from + (xy % x_width);
  int y = y_from + (xy / x_width);
  ray = cam->get_ray_at(x, y);
  L = trace_ray(ray, 0, cam->inter);
  cam->set_pixel(x, y, L);
}
dtime = omp_get_wtime() - dtime;
printf("time %f\n", dtime);

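That way ray and L are explicitly private to each of the threads sharing the loop. Variables declared outside a parallel region are shared between threads by default, so they need the private clause once they are moved out of the block. (In your original code they are declared inside the parallel block, which already gives each thread its own copy, so this change on its own may not restore the parallelism.)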
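To illustrate the scoping rule, here is a minimal sketch of the two ways of giving each thread its own scratch variable. The function shade() and the integer "pixel" values are hypothetical stand-ins for trace_ray() and rgba, which are not shown in the question.

```cpp
#include <vector>

// Hypothetical stand-in for the per-pixel work (trace_ray in the question).
static int shade(int xy) { return xy * 2 + 1; }

// Variant 1: the scratch variable is declared inside the parallel region,
// so every thread automatically gets its own copy.
std::vector<int> render_region_local(int n) {
    std::vector<int> out(n);
    #pragma omp parallel
    {
        int L;
        #pragma omp for schedule(dynamic)
        for (int xy = 0; xy < n; xy++) {
            L = shade(xy);
            out[xy] = L;  // each iteration writes a distinct element
        }
    }
    return out;
}

// Variant 2: the scratch variable is declared outside the region and
// made per-thread with the private() clause.
std::vector<int> render_private_clause(int n) {
    std::vector<int> out(n);
    int L;
    #pragma omp parallel for schedule(dynamic) private(L)
    for (int xy = 0; xy < n; xy++) {
        L = shade(xy);
        out[xy] = L;
    }
    return out;
}
```

Both variants should produce the same output. If the pragmas are ignored because OpenMP is not enabled, the loops simply run serially.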

Also, omp_get_num_procs() "Returns the number of processors available to the program." according to the OpenMP API 3.1 C/C++ Syntax Quick Reference Card, so it does not necessarily tell you how many threads are actually being used in a region. For that, call omp_get_num_threads() inside the parallel region; omp_get_thread_num() returns the calling thread's index within the team.
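A quick way to see how many threads really run a region is to ask from inside it. The following is only a sketch: the helper name count_team_threads() is made up for illustration, and the _OPENMP guards are there so the file still compiles when OpenMP is not enabled.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Returns the number of threads that actually execute a parallel region.
// If OpenMP is not enabled, the guarded block is skipped and this returns 1.
int count_team_threads() {
    int team_size = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        // omp_get_num_threads() reports the size of the current team.
        // "single" lets exactly one thread record it; the implicit
        // barrier makes the value visible after the region.
        #pragma omp single
        team_size = omp_get_num_threads();
    }
#endif
    return team_size;
}
```

If this returns 1 on an 8-core machine, either the runtime was told to use one thread (for example through OMP_NUM_THREADS) or the OpenMP flag never reached the compiler.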

It seems to have been a problem external to the program. I had changed IDE (Anjuta) versions. Anjuta is very dependent on pkg-config. OpenMP doesn't have pkg-config .pc files, so I made one for the libgomp library. I added -lgomp to Libs:, which went fine, and added -fopenmp to both Libs: and Cflags:, which didn't go well.

For some reason, -fopenmp was added into a command line parameter called -DPACKAGE_SRC_DIR (inside its quoted value - see edit in original message) and as such was ignored by the linker and compiler. I'll ask about this on the Anjuta forum.

So, the solution was to remove it from the .pc file, and add it manually to the project parameters as 'CXXFLAGS=-fopenmp' 'LDFLAGS=-fopenmp' (I wanted to avoid this as surely next time I'll forget to do it :)
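One way to notice a dropped flag earlier is to check the _OPENMP macro, which the compiler defines only when OpenMP is enabled (with GCC, when -fopenmp really reaches the compile command). This is a sketch: the function names openmp_spec_date() and openmp_status() are invented for illustration.

```cpp
#include <string>

// _OPENMP expands to the yyyymm date of the supported OpenMP specification.
// It is undefined when the compiler was not asked to enable OpenMP,
// in which case the omp pragmas are ignored.
long openmp_spec_date() {
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0;  // the flag did not reach the compiler
#endif
}

// Human-readable summary, suitable for printing once at program start.
std::string openmp_status() {
    long date = openmp_spec_date();
    if (date == 0) {
        return "OpenMP disabled: pragmas are being ignored";
    }
    return "OpenMP enabled, spec date " + std::to_string(date);
}
```

Printing openmp_status() at startup would have shown the problem immediately, without having to inspect the command line generated by the IDE.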

Anyway, it works like this. Thanks for the suggestions.
