DO WHILE loop with OpenMP in Fortran

I am creating a random distribution of points in Fortran using a do while loop. I want to speed this up with OpenMP, but I read that you can't simply use !$OMP PARALLEL DO on a do while loop. I tried converting my original do while into a do loop nested inside the do while. However, I see no speedup: the code takes the same time as the serial version. I can't figure out what the issue is and would appreciate any advice. The code is shown below.

The original loop:

!OMP PARALLEL DO
do while (count < size(zeta_list,2))
    call random_number(x)
    call random_number(y)
    x1 = a + FLOOR((b+1-a)*x)
    y1 = a + FLOOR((b+1-a)*y)
    if (abs(y1) <= abs(1/x1)) then
        count = count + 1
        call random_number(theta)
        zeta_list(1,count) = x1*sin(2*pi_16*theta)
        zeta_list(2,count) = x1*cos(2*pi_16*theta)
    end if 
end do  
!OMP END PARALLEL DO

and after I tried to convert it:

!$OMP PARALLEL 
do while (count < size(zeta_list,2))
    !$OMP DO
    do i=1,size(zeta_list,2),1
        call random_number(x)
        call random_number(y)
        x1 = a + FLOOR((b+1-a)*x)
        y1 = a + FLOOR((b+1-a)*y)
        if (abs(y1) <= abs(1/x1)) then
            call random_number(theta)
            count = count + 1
            zeta_list(1,i) = x1*sin(2*pi_16*theta)
            zeta_list(2,i) = x1*cos(2*pi_16*theta)
        end if
    end do
    !$OMP END DO 
end do  
!$OMP END PARALLEL

The entire code is:

PROGRAM RANDOM_DISTRIBUTION

IMPLICIT NONE 

DOUBLE PRECISION, DIMENSION(2,1000000)::zeta_list
DOUBLE PRECISION::x,y,x1,y1,theta
REAL::a,b,n
INTEGER::count,t1,t2,clock_rate,clock_max,i
DOUBLE PRECISION,PARAMETER::pi_16=4*atan(1.0_16)

call system_clock ( t1, clock_rate, clock_max )

n = 1000
b = n/2
a = -n/2
count = 0
zeta_list = 0
x = 0
y = 0
x1 = 0 
y1 = 0 
theta = 0

call random_seed()



!$OMP PARALLEL 
do while (count < size(zeta_list,2))
    !$OMP DO
    do i=1,size(zeta_list,2),1
        call random_number(x)
        call random_number(y)
        x1 = a + FLOOR((b+1-a)*x)
        y1 = a + FLOOR((b+1-a)*y)
        if (abs(y1) <= abs(1/x1)) then
            call random_number(theta)
            count = count + 1
            zeta_list(1,i) = x1*sin(2*pi_16*theta)
            zeta_list(2,i) = x1*cos(2*pi_16*theta)
        end if
    end do
    !$OMP END DO 
end do  
!$OMP END PARALLEL


call system_clock ( t2, clock_rate, clock_max )
write ( *, * ) 'Elapsed real time = ', real ( t2 - t1 ) / real ( clock_rate) ,'seconds' 


stop
END PROGRAM RANDOM_DISTRIBUTION

compiled with gfortran test.f90 -fopenmp

Instead of trying to distribute a hard-to-parallelize while loop, I propose the following: loop over the array index.

I assume that you want to fill the array zeta_list with random samples, so I moved the while loop inside the parallel loop: each iteration i repeats the draw until a sample is accepted, then stores it in column i.

Still, beware that you need an "OpenMP-aware" PRNG. Recent gfortran versions provide one; I don't know about other compilers.

I also changed 1.0_16 to 1.0d0, since hard-coded numeric constants are in general not a good way to specify the kind parameter, and I reduced the size of the static array.
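As a side note on the kind parameter, here is a minimal sketch of a more portable way to request double precision than a literal kind number. The names kind_demo, dp and pi are only illustrative and do not appear in the code of this thread.

```fortran
program kind_demo
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none

  ! Ask for a kind by required precision (at least 15 decimal digits,
  ! decimal exponent range of at least 307) instead of a literal number.
  integer, parameter :: dp = selected_real_kind(15, 307)

  ! The literal carries the named kind, so atan is evaluated in that kind.
  real(dp), parameter :: pi = 4.0_dp * atan(1.0_dp)

  ! Kind numbers are compiler specific; print them rather than assume them.
  write(*,*) 'kind from selected_real_kind = ', dp
  write(*,*) 'kind named real64            = ', real64
  write(*,*) 'decimal digits of precision  = ', precision(pi)
  write(*,*) 'pi                           = ', pi
end program kind_demo
```

A suffix such as _16 only means quadruple precision on compilers that happen to number their kinds that way, which is why a named constant is usually preferred.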

PROGRAM RANDOM_DISTRIBUTION

  IMPLICIT NONE 

  DOUBLE PRECISION, DIMENSION(2,100000)::zeta_list
  DOUBLE PRECISION::x,y,x1,y1,theta
  REAL::a,b,n
  INTEGER::count,t1,t2,clock_rate,clock_max,i
  DOUBLE PRECISION,PARAMETER::pi_16=4*atan(1.0d0)

  call system_clock ( t1, clock_rate, clock_max )

  n = 1000
  b = n/2
  a = -n/2
  count = 0
  zeta_list = 0
  x = 0
  y = 0
  x1 = 0 
  y1 = 0 
  theta = 0

  call random_seed()

  !$OMP PARALLEL DO private(i, x, y, x1, y1, theta)
  do i = 1, size(zeta_list, 2)
     inner_loop: do
        call random_number(x)
        call random_number(y)
        x1 = a + FLOOR((b+1-a)*x)
        y1 = a + FLOOR((b+1-a)*y)
        if (abs(y1) <= abs(1/x1)) then
           call random_number(theta)
           zeta_list(1,i) = x1*sin(2*pi_16*theta)
           zeta_list(2,i) = x1*cos(2*pi_16*theta)
           exit inner_loop
        end if
     end do inner_loop
  end do
  !$OMP END PARALLEL DO

  write(*,*) zeta_list(:,1)
  write(*,*) zeta_list(:,2)

  call system_clock ( t2, clock_rate, clock_max )
  write ( *, * ) 'Elapsed real time = ', real ( t2 - t1 ) / real ( clock_rate) ,'seconds' 

END PROGRAM RANDOM_DISTRIBUTION

Using random_number in OpenMP threads is safe with gfortran 5, but you need gfortran 7 to get a threaded random number generator. Here are the timings with two cores:

user@pc$ gfortran-5 -O3 -Wall -fopenmp -o prd prd.f90
user@pc$ OMP_NUM_THREADS=1 ./prd
   47.496326386583306        237.29327630545950     
  -101.11803913888293        147.70288474064185     
 Elapsed real time =    3.47700000     seconds
user@pc$ OMP_NUM_THREADS=2 ./prd
   0.0000000000000000       -0.0000000000000000     
  -160.53394672041205        49.526275353269853     
 Elapsed real time =    12.1479998     seconds
user@pc$ rm fort.1*
user@pc$ gfortran-5 -O3 -Wall -fopenmp -o prd prd.f90
user@pc$ OMP_NUM_THREADS=1 ./prd
 Elapsed real time =    3.05100012     seconds
user@pc$ OMP_NUM_THREADS=2 ./prd
 Elapsed real time =    9.09599972     seconds
user@pc$ gfortran-6 -O3 -Wall -fopenmp -o prd prd.f90
user@pc$ OMP_NUM_THREADS=1 ./prd
 Elapsed real time =    3.09200001     seconds
user@pc$ OMP_NUM_THREADS=2 ./prd
 Elapsed real time =    12.3350000     seconds
user@pc$ gfortran-7 -O3 -Wall -fopenmp -o prd prd.f90
user@pc$ OMP_NUM_THREADS=1 ./prd
 Elapsed real time =    1.83200002     seconds
user@pc$ OMP_NUM_THREADS=2 ./prd
 Elapsed real time =   0.986999989     seconds

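The result is clear: prior to gfortran 7, parallelizing this code with OpenMP slows it down significantly. With two threads, gfortran 5 and 6 take roughly 9 to 12 s versus about 3 s with one thread, while gfortran 7 drops from 1.83 s to 0.99 s.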
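If you are stuck with a compiler whose random_number does not scale across threads, one possible workaround is to give each thread its own generator state. The following is only a sketch under that assumption: it uses a simple xorshift64 generator, the names xorshift_mod, xs_seed and xs_uniform are hypothetical, and I have not checked its statistical quality, so do not take it as a replacement for a vetted PRNG.

```fortran
module xorshift_mod
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  private
  public :: xs_seed, xs_uniform
contains

  ! Derive a nonzero starting state from the thread id (illustrative seeding).
  subroutine xs_seed(state, tid)
    integer(int64), intent(out) :: state
    integer, intent(in) :: tid
    real(real64) :: dummy
    integer :: k
    state = 88172645463325252_int64 + 1000003_int64 * int(tid + 1, int64)
    do k = 1, 20                      ! discard a few values as warm-up
       dummy = xs_uniform(state)
    end do
  end subroutine xs_seed

  ! One xorshift64 step; returns a value in [0,1) built from the top 53 bits.
  function xs_uniform(state) result(u)
    integer(int64), intent(inout) :: state
    real(real64) :: u
    state = ieor(state, ishft(state, 13))
    state = ieor(state, ishft(state, -7))    ! ishft is a logical shift
    state = ieor(state, ishft(state, 17))
    u = real(ishft(state, -11), real64) * 2.0_real64**(-53)
  end function xs_uniform

end module xorshift_mod

program prng_demo
  use, intrinsic :: iso_fortran_env, only: int64, real64
  use xorshift_mod
  !$ use omp_lib
  implicit none
  integer, parameter :: n = 100000
  real(real64) :: u(n)
  integer(int64) :: state
  integer :: i, tid

  !$omp parallel private(state, tid, i)
  tid = 0
  !$ tid = omp_get_thread_num()
  call xs_seed(state, tid)            ! every thread owns its own state
  !$omp do
  do i = 1, n
     u(i) = xs_uniform(state)         ! no shared generator, no locking
  end do
  !$omp end do
  !$omp end parallel

  if (any(u < 0.0_real64) .or. any(u >= 1.0_real64)) error stop 'sample outside [0,1)'
  write(*,*) 'mean of samples = ', sum(u) / n
end program prng_demo
```

In the program above, the calls to random_number(x), random_number(y) and random_number(theta) would become x = xs_uniform(state) and so on, with state added to the private clause and seeded once per thread. The lines starting with !$ are only compiled when OpenMP is enabled, so the same file still builds serially.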
