OpenMP parallelization stopped working
On Linux, AMD 8-core processor, using g++ 4.7.1.
For me, this is a head-banger. The following code was working perfectly, and for some reason it stopped parallelizing. I added the omp_get_num_procs() call, and it prints 8 processors. I checked the compilation, and -fopenmp is present as an option for both compiling and linking. There are no compilation or link error messages. I checked whether any environment variables were defined (OMP_xxx); there were none.
Are there other, external factors that could influence this?
        #pragma omp parallel
        {
            lightray ray;
            rgba L;
            printf("Max nr processors: %d\n", omp_get_num_procs());
            #pragma omp for schedule(dynamic)
            for (int xy = 0; xy < xy_range; xy++) {
                int x = x_from + (xy % x_width);
                int y = y_from + (xy / x_width);
                ray = cam->get_ray_at(x, y);
                L = trace_ray(ray, 0, cam->inter);
                cam->set_pixel(x, y, L);
            }
        }
        dtime = omp_get_wtime() - dtime;
        printf("time %f\n", dtime);
    }
EDIT: I think I've found something here... The command line for g++ generated by Anjuta contains this:
-DPACKAGE_LOCALE_DIR=\""/usr/local/share/locale"\" -DPACKAGE_SRC_DIR=\"".. -fopenmp . "\"
The PACKAGE_SRC_DIR definition seems to 'include' the -fopenmp flag, which would hide it from g++. I haven't found the cause yet...
Try rewriting it this way:
    lightray ray;
    rgba L;
    printf("Max nr processors: %d\n", omp_get_num_procs());
    #pragma omp parallel for schedule(dynamic) private(ray,L)
    for (int xy = 0; xy < xy_range; xy++) {
        int x = x_from + (xy % x_width);
        int y = y_from + (xy / x_width);
        ray = cam->get_ray_at(x, y);
        L = trace_ray(ray, 0, cam->inter);
        cam->set_pixel(x, y, L);
    }
    dtime = omp_get_wtime() - dtime;
    printf("time %f\n", dtime);
That way you introduce ray and L as variables specific to each of the threads tag-teaming the loop. Variables defined outside of a parallel region are shared between threads by default, so with the declarations sitting outside the region it is the private(ray,L) clause that keeps the threads from munging these two variables. As posted, your code declares them inside the parallel block, which should already make them private to each thread, so this rewrite mainly makes the intent explicit.
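To make the effect of private concrete, here is a minimal sketch that is independent of the ray tracer. The function name squares and the variable scratch are made up for illustration; the only assumption is a C++11 compiler, and the code also builds without -fopenmp (the pragma is then ignored and the loop runs serially).

```cpp
#include <cassert>
#include <vector>

// Fills out[i] = i * i using a scratch variable declared OUTSIDE the loop.
// By default `scratch` would be shared by all threads; private(scratch)
// gives every thread its own uninitialized copy for the duration of the loop.
std::vector<long> squares(int n) {
    assert(n >= 0);  // a negative size would not make sense here
    std::vector<long> out(n);
    long scratch = 0;  // declared outside the parallel region
    #pragma omp parallel for schedule(dynamic) private(scratch)
    for (int i = 0; i < n; i++) {
        scratch = static_cast<long>(i) * i;  // written before it is read
        out[i] = scratch;                    // one distinct element per iteration
    }
    return out;
}
```

If private(scratch) were dropped, two threads could interleave the write and the read of scratch and store the wrong value in out[i]. Declaring scratch inside the loop body would have the same effect as the clause.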
Also, according to the OpenMP API 3.1 C/C++ Syntax Quick Reference Card, omp_get_num_procs() "Returns the number of processors available to the program." It therefore does not necessarily tell you how many threads are actually being used in a region. For that you may want omp_get_num_threads() (the size of the current team, when called inside a parallel region) or omp_get_thread_num() (the index of the calling thread).
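As a rough sketch of that check, the helper below reports how many threads actually run a parallel region. The name threads_in_region is hypothetical, and the fallback value of 1 for a build without OpenMP is my own choice rather than anything OpenMP prescribes.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Returns the size of the thread team that really executes a parallel region.
// Falls back to 1 when the file is compiled without OpenMP support.
int threads_in_region() {
    int team_size = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        // omp_get_num_threads() is only meaningful inside the region;
        // `single` lets exactly one thread store the result.
        #pragma omp single
        team_size = omp_get_num_threads();
    }
#endif
    assert(team_size >= 1);
    return team_size;
}
```

If this returns 1 on a machine where omp_get_num_procs() reports 8, the region is not being parallelized, whatever the processor count says.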
It seems to have been a problem external to the program. I did change IDE versions (Anjuta). Anjuta is very dependent on pkg-config. OpenMP doesn't have pkg-config .pc files, so I made one for the libgomp library. I added -lgomp to Libs:, which went fine, and added -fopenmp to both Libs: and Cflags:, which didn't go well.
For some reason, -fopenmp ended up inside a command-line parameter called -DPACKAGE_SRC_DIR (inside its quoted value; see the edit in the original message) and as such was ignored by the compiler and linker. I'll ask about this on the Anjuta forum.
So, the solution was to remove it from the .pc file and add it manually to the project parameters as 'CXXFLAGS=-fopenmp' 'LDFLAGS=-fopenmp' (I wanted to avoid this, as next time I'll surely forget to do it :)
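One way to notice a swallowed flag earlier is to test the _OPENMP macro, which the compiler defines only when OpenMP is really enabled. This probably also explains the silent failure: with -lgomp still on the link line the omp_* calls resolve, while every #pragma omp is simply ignored. The sketch below is only an illustration; openmp_version and openmp_status are hypothetical helper names.

```cpp
#include <cassert>
#include <string>

// _OPENMP is defined only when OpenMP is enabled for this translation unit
// (for g++, when -fopenmp actually reaches the compile command). Its value
// is the yyyymm date of the supported OpenMP specification.
long openmp_version() {
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0;  // no OpenMP: #pragma omp lines were ignored
#endif
}

// Human-readable summary, e.g. for printing once at program start.
std::string openmp_status() {
    long v = openmp_version();
    assert(v >= 0);
    if (v == 0) {
        return std::string("OpenMP disabled: #pragma omp lines are ignored");
    }
    return std::string("OpenMP enabled, _OPENMP=") + std::to_string(v);
}

// Stricter alternative (a suggestion, not part of the original fix):
// make the build fail when the flag is missing.
// #ifndef _OPENMP
// #error "OpenMP is not enabled; compile with -fopenmp"
// #endif
```

Printing openmp_status() next to the timing output would show right away whether the build picked up the flag. The commented-out #error guard is the stricter choice if forgetting the flag next time is the main worry.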
Anyway, it works like this. Thanks for the suggestions.