How to write to file from different threads, OpenMP, C++
I use OpenMP to parallelize my C++ program. My parallel code has a very simple form:
#pragma omp parallel for shared(a, b, c) private(i, result)
for (i = 0; i < N; i++) {
    result = F(a, b, c, i); // do some calculation
    cout << i << " " << result << endl;
}
If two threads try to write to the file simultaneously, the data gets mixed up. How can I solve this problem?
OpenMP provides pragmas to help with synchronisation. #pragma omp critical allows only one thread to execute the attached statement at any time (a mutual exclusion critical region). #pragma omp ordered ensures that the loop iterations enter the region in iteration order.
// g++ -std=c++11 -Wall -Wextra -pedantic -fopenmp critical.cpp
#include <iostream>

int main()
{
    #pragma omp parallel for
    for (int i = 0; i < 20; ++i)
        std::cout << "unsynchronized(" << i << ") ";
    std::cout << std::endl;

    #pragma omp parallel for
    for (int i = 0; i < 20; ++i)
        #pragma omp critical
        std::cout << "critical(" << i << ") ";
    std::cout << std::endl;

    #pragma omp parallel for ordered
    for (int i = 0; i < 20; ++i)
        #pragma omp ordered
        std::cout << "ordered(" << i << ") ";
    std::cout << std::endl;

    return 0;
}
Example output (in general different on each run):
unsynchronized(unsynchronized(unsynchronized(05) unsynchronized() 6unsynchronized() 1unsynchronized(7) ) unsynchronized(unsynchronized(28) ) unsynchronized(unsynchronized(93) ) unsynchronized(4) 10) unsynchronized(11) unsynchronized(12) unsynchronized(15) unsynchronized(16unsynchronized() 13unsynchronized() 17) unsynchronized(unsynchronized(18) 14unsynchronized() 19)
critical(5) critical(0) critical(6) critical(15) critical(1) critical(10) critical(7) critical(16) critical(2) critical(8) critical(17) critical(3) critical(9) critical(18) critical(11) critical(4) critical(19) critical(12) critical(13) critical(14)
ordered(0) ordered(1) ordered(2) ordered(3) ordered(4) ordered(5) ordered(6) ordered(7) ordered(8) ordered(9) ordered(10) ordered(11) ordered(12) ordered(13) ordered(14) ordered(15) ordered(16) ordered(17) ordered(18) ordered(19)
The problem is that you have a single resource which all threads try to access. Such a single resource must be protected against concurrent access (thread-safe resources do this too, just transparently for you; by the way: here is a nice answer about thread safety of std::cout).
You could now protect this single resource, e.g. with a std::mutex. The problem then is that the threads have to wait for the mutex until the thread holding it releases it again. So you will only profit from parallelisation if F is a very complex function.
A further drawback: as the threads work in parallel, even with a mutex protecting std::cout the results can be printed in arbitrary order, depending on which thread happens to run earlier.
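The mutex approach could look roughly like the following. This is a minimal sketch, not a definitive implementation: `compute` is a hypothetical stand-in for `F`, and `write_results` is a name invented here. It writes to a generic `std::ostream`, so it can be passed `std::cout` or a `std::ofstream`. It also assumes that the threads of the OpenMP runtime cooperate with `std::mutex`, which I believe holds for the common implementations but is worth checking for yours; `#pragma omp critical` is the pure OpenMP alternative.

```cpp
// Build with OpenMP enabled, e.g.: g++ -std=c++11 -fopenmp mutex_output.cpp
#include <mutex>
#include <ostream>
#include <sstream>

// Hypothetical stand-in for the F(a, b, c, i) of the question.
double compute(int i) { return 0.5 * i; }

// Writes one "i result" line per iteration to out.
// The mutex keeps each line intact; it does not fix the order of the lines.
void write_results(std::ostream& out, int n)
{
    std::mutex out_mutex;  // shared by all threads, guards every access to out

    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        const double result = compute(i);  // runs in parallel, no lock held

        std::lock_guard<std::mutex> lock(out_mutex);  // released at end of scope
        out << i << " " << result << '\n';            // one thread at a time
    }
}
```

Calling `write_results(std::cout, 20)` from `main` prints 20 unmangled lines, but their order still depends on thread scheduling. The lock is taken only after the calculation, so the threads serialize on the output alone.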
If I may assume that you want the results of F(... i) for smaller i printed before the results for greater i, you should either drop parallelisation entirely or do it differently:
Provide an array of size N and let each thread store its results there ( array[i] = f(i); ). Then iterate over the array in a separate non-parallel loop. Again, doing so is only worth the effort if F is a complex function (and for large N).
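The array approach can be sketched as follows. Again this is only an illustration under assumptions: `compute` is a hypothetical stand-in for `F`, `write_results_in_order` is a name invented here, and a `std::vector` plays the role of the array of size N.

```cpp
// Build with OpenMP enabled, e.g.: g++ -std=c++11 -fopenmp ordered_output.cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical stand-in for the F(a, b, c, i) of the question.
double compute(int i) { return 0.5 * i; }

// Computes all results in parallel, then writes them in index order.
void write_results_in_order(std::ostream& out, int n)
{
    std::vector<double> results(static_cast<std::size_t>(n));

    // Each iteration writes only its own element, so no locking is needed.
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        results[i] = compute(i);

    // Serial loop: the stream is touched by a single thread, in order.
    for (int i = 0; i < n; ++i)
        out << i << " " << results[i] << '\n';
}
```

The cost of this design is the memory for N results and the fact that nothing is written until the whole parallel loop has finished.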
Additionally, be aware that threads must be created too, which causes some overhead (creating the thread infrastructure and stack, registering the thread with the OS, ...), unless you can reuse threads already created earlier in a thread pool. Consider this as well when deciding whether or not to parallelise. Sometimes, non-parallel calculations can be faster...