How to write to file from different threads, OpenMP, C++

I use OpenMP to parallelize my C++ program. My parallel code has a very simple form:

#pragma omp parallel for shared(a, b, c) private(i, result)
for (i = 0; i < N; i++) {
    result = F(a, b, c, i); // do some calculation
    cout << i << " " << result << endl;
}

If two threads try to write to the file simultaneously, the data gets mixed up. How can I solve this problem?

OpenMP provides pragmas to help with synchronisation. #pragma omp critical allows only one thread at a time to execute the attached statement (a mutual-exclusion critical region). #pragma omp ordered ensures that the loop iterations enter the region in iteration order; the enclosing loop directive must carry the ordered clause for this to work.

// g++ -std=c++11 -Wall -Wextra -pedantic -fopenmp critical.cpp
#include <iostream>

int main()
{
  #pragma omp parallel for
  for (int i = 0; i < 20; ++i)
    std::cout << "unsynchronized(" << i << ") ";
  std::cout << std::endl;
  #pragma omp parallel for
  for (int i = 0; i < 20; ++i)
    #pragma omp critical
    std::cout << "critical(" << i << ") ";
  std::cout << std::endl;
  #pragma omp parallel for ordered
  for (int i = 0; i < 20; ++i)
    #pragma omp ordered
    std::cout << "ordered(" << i << ") ";
  std::cout << std::endl;
  return 0;
}

Example output (in general different on each run):

unsynchronized(unsynchronized(unsynchronized(05) unsynchronized() 6unsynchronized() 1unsynchronized(7) ) unsynchronized(unsynchronized(28) ) unsynchronized(unsynchronized(93) ) unsynchronized(4) 10) unsynchronized(11) unsynchronized(12) unsynchronized(15) unsynchronized(16unsynchronized() 13unsynchronized() 17) unsynchronized(unsynchronized(18) 14unsynchronized() 19) 
critical(5) critical(0) critical(6) critical(15) critical(1) critical(10) critical(7) critical(16) critical(2) critical(8) critical(17) critical(3) critical(9) critical(18) critical(11) critical(4) critical(19) critical(12) critical(13) critical(14) 
ordered(0) ordered(1) ordered(2) ordered(3) ordered(4) ordered(5) ordered(6) ordered(7) ordered(8) ordered(9) ordered(10) ordered(11) ordered(12) ordered(13) ordered(14) ordered(15) ordered(16) ordered(17) ordered(18) ordered(19) 
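
Applied to the loop from the question, the critical variant could look like the sketch below. It is only an illustration under assumptions: F is a hypothetical stand-in for the real calculation, and the names write_results_critical and file_io are made up here. Each line is formatted into a per-iteration buffer first, so the critical region covers nothing but the actual write. Any std::ostream works as the target, for example a std::ofstream.

```cpp
// Build with OpenMP enabled, e.g.: g++ -std=c++11 -fopenmp example.cpp
// Without -fopenmp the pragmas are ignored and the loop simply runs serially.
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the expensive calculation from the question.
double F(double a, double b, double c, int i)
{
  return a * i * i + b * i + c;
}

// Writes one "i result" line per iteration to `out`.
// Lines are never interleaved, but their order is not deterministic.
void write_results_critical(std::ostream& out, double a, double b, double c, int n)
{
  assert(n >= 0);
  #pragma omp parallel for
  for (int i = 0; i < n; ++i) {
    const double result = F(a, b, c, i);  // declared in the loop body, so private
    std::ostringstream line;               // per-iteration formatting buffer
    line << i << " " << result << "\n";
    const std::string text = line.str();
    #pragma omp critical(file_io)          // one thread at a time writes a whole line
    out << text;
  }
}
```

The lines arrive complete, but still in arbitrary order; replacing critical with the ordered construct restores the order at the cost of more waiting.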

The problem is that you have a single resource that all threads try to access. Such a resource must be protected against concurrent access (thread-safe resources do this too, just transparently for you; by the way, the thread safety of std::cout has been discussed in a separate answer). You can protect this single resource with, e.g., a std::mutex. The problem then is that each thread has to wait for the mutex until the thread holding it releases it again. So you will only profit from parallelisation if F is a very complex (i.e. computationally expensive) function.
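
A minimal sketch of the std::mutex idea, assuming a hypothetical F and a made-up helper class LockedWriter that bundles the stream with its mutex. One caveat I am fairly sure of but have not verified for every compiler: combining std::mutex with OpenMP threads works with common implementations, yet #pragma omp critical is the OpenMP-native way to get the same effect.

```cpp
// Build with OpenMP enabled, e.g.: g++ -std=c++11 -fopenmp example.cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the expensive calculation from the question.
double F(double a, double b, double c, int i)
{
  return a * i * i + b * i + c;
}

// Hypothetical wrapper: the stream and the mutex that protects it live together,
// so callers cannot forget to take the lock.
class LockedWriter {
public:
  explicit LockedWriter(std::ostream& out) : out_(out) {}

  void write_line(const std::string& line)
  {
    std::lock_guard<std::mutex> lock(mutex_);  // released when `lock` goes out of scope
    out_ << line << '\n';
  }

private:
  std::ostream& out_;
  std::mutex mutex_;
};

void write_results_mutex(std::ostream& out, double a, double b, double c, int n)
{
  assert(n >= 0);
  LockedWriter writer(out);  // shared by all threads of the loop below
  #pragma omp parallel for
  for (int i = 0; i < n; ++i) {
    std::ostringstream line;            // formatting happens without holding the lock
    line << i << " " << F(a, b, c, i);
    writer.write_line(line.str());      // only the write itself is serialised
  }
}
```

The threads spend time waiting only inside write_line, which is why the approach pays off only when F dominates the cost of one iteration.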

Further drawback: as the threads run in parallel, the results can be printed in arbitrary order even with a mutex protecting std::cout, depending on which thread happens to get there first.

If I may assume that you want the results of F(..., i) for smaller i to appear before those for greater i, you should either drop parallelisation entirely or do it differently:

Provide an array of size N and let each thread store its results there ( array[i] = f(i); ). Then iterate over the array in a separate, non-parallel loop. Again, doing so is only worth the effort if F is a complex function (and N is large).
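
The array approach can be sketched as follows. F is again a hypothetical stand-in, and write_results_in_order and results_as_text are names invented for this example. No locking is needed, because every iteration writes only to its own slot of the array.

```cpp
// Build with OpenMP enabled, e.g.: g++ -std=c++11 -fopenmp example.cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the expensive calculation from the question.
double F(double a, double b, double c, int i)
{
  return a * i * i + b * i + c;
}

void write_results_in_order(std::ostream& out, double a, double b, double c, int n)
{
  assert(n >= 0);
  std::vector<double> results(static_cast<std::size_t>(n));

  // Parallel part: compute only, each iteration owns results[i].
  #pragma omp parallel for
  for (int i = 0; i < n; ++i)
    results[i] = F(a, b, c, i);

  // Serial part: a single thread writes, so the output is in index order.
  for (int i = 0; i < n; ++i)
    out << i << " " << results[i] << "\n";
}

// Usage sketch: collect the output in memory instead of in a file.
std::string results_as_text(double a, double b, double c, int n)
{
  std::ostringstream out;
  write_results_in_order(out, a, b, c, n);
  return out.str();
}
```

To write to a file instead, pass a std::ofstream to write_results_in_order.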

Additionally: be aware that threads must be created, too, which causes some overhead (creating the thread infrastructure and stack, registering the thread with the OS, ... unless you can reuse threads already created earlier in a thread pool). Consider this, too, when deciding whether or not to parallelise. Sometimes, non-parallel calculations are faster.
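
One way to act on this inside OpenMP itself is the if clause, which makes the region run on a single thread when its condition is false. The sketch below assumes a hypothetical F and a made-up cut-off kParallelThreshold; the value 10000 is a placeholder, not a measured number, so time both variants on your own machine before settling on one.

```cpp
// Build with OpenMP enabled, e.g.: g++ -std=c++11 -fopenmp example.cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the expensive calculation from the question.
double F(double a, double b, double c, int i)
{
  return a * i * i + b * i + c;
}

// Hypothetical cut-off below which the loop is not worth parallelising.
const int kParallelThreshold = 10000;

std::vector<double> compute_results(double a, double b, double c, int n)
{
  assert(n >= 0);
  std::vector<double> results(static_cast<std::size_t>(n));

  // With a false condition the loop is executed by one thread only,
  // which avoids the overhead of distributing a small amount of work.
  #pragma omp parallel for if(n >= kParallelThreshold)
  for (int i = 0; i < n; ++i)
    results[i] = F(a, b, c, i);

  return results;
}
```

The results are the same either way; only the run time differs.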
