DO WHILE loop with OpenMP in Fortran
I am creating a random distribution of points in Fortran using a do while loop. I want to speed this up with OpenMP, but I read that you can't simply use !$OMP PARALLEL DO on a do while loop. I tried converting my original do while into a do loop nested inside the do while. However, I see no speedup: the code takes the same time as the serial version. I can't figure out what the issue is and I've been stuck, so I would appreciate any advice. The code is shown below.
The original loop:
!OMP PARALLEL DO
do while (count < size(zeta_list,2))
    call random_number(x)
    call random_number(y)
    x1 = a + FLOOR((b+1-a)*x)
    y1 = a + FLOOR((b+1-a)*y)
    if (abs(y1) <= abs(1/x1)) then
        count = count + 1
        call random_number(theta)
        zeta_list(1,count) = x1*sin(2*pi_16*theta)
        zeta_list(2,count) = x1*cos(2*pi_16*theta)
    end if
end do
!OMP END PARALLEL DO
and after I tried to convert it:
!$OMP PARALLEL
do while (count < size(zeta_list,2))
    !$OMP DO
    do i=1,size(zeta_list,2),1
        call random_number(x)
        call random_number(y)
        x1 = a + FLOOR((b+1-a)*x)
        y1 = a + FLOOR((b+1-a)*y)
        if (abs(y1) <= abs(1/x1)) then
            call random_number(theta)
            count = count + 1
            zeta_list(1,i) = x1*sin(2*pi_16*theta)
            zeta_list(2,i) = x1*cos(2*pi_16*theta)
        end if
    end do
    !$OMP END DO
end do
!$OMP END PARALLEL
The entire code is:
PROGRAM RANDOM_DISTRIBUTION
    IMPLICIT NONE
    DOUBLE PRECISION, DIMENSION(2,1000000)::zeta_list
    DOUBLE PRECISION::x,y,x1,y1,theta
    REAL::a,b,n
    INTEGER::count,t1,t2,clock_rate,clock_max,i
    DOUBLE PRECISION,PARAMETER::pi_16=4*atan(1.0_16)
    call system_clock ( t1, clock_rate, clock_max )
    n = 1000
    b = n/2
    a = -n/2
    count = 0
    zeta_list = 0
    x = 0
    y = 0
    x1 = 0
    y1 = 0
    theta = 0
    call random_seed()
    !$OMP PARALLEL
    do while (count < size(zeta_list,2))
        !$OMP DO
        do i=1,size(zeta_list,2),1
            call random_number(x)
            call random_number(y)
            x1 = a + FLOOR((b+1-a)*x)
            y1 = a + FLOOR((b+1-a)*y)
            if (abs(y1) <= abs(1/x1)) then
                call random_number(theta)
                count = count + 1
                zeta_list(1,i) = x1*sin(2*pi_16*theta)
                zeta_list(2,i) = x1*cos(2*pi_16*theta)
            end if
        end do
        !$OMP END DO
    end do
    !$OMP END PARALLEL
    call system_clock ( t2, clock_rate, clock_max )
    write ( *, * ) 'Elapsed real time = ', real ( t2 - t1 ) / real ( clock_rate) ,'seconds'
    stop
END PROGRAM RANDOM_DISTRIBUTION
compiled with gfortran test.f90 -fopenmp
Instead of trying to distribute a do while loop, which is hard to parallelize, I propose looping over the array index. I assume that you want to generate random samples in the array zeta_list. I moved the while loop inside the parallel loop, so each iteration retries until it accepts a sample.
Still, beware that you need an "OpenMP-aware" PRNG. This is the case in recent gfortran versions; I don't know about other compilers.
I also changed 1.0_16 to 1.0d0, as hard-coded numeric constants are not a good way to specify the kind parameter in general, and I reduced the size of the static array.
PROGRAM RANDOM_DISTRIBUTION
    IMPLICIT NONE
    DOUBLE PRECISION, DIMENSION(2,100000)::zeta_list
    DOUBLE PRECISION::x,y,x1,y1,theta
    REAL::a,b,n
    INTEGER::count,t1,t2,clock_rate,clock_max,i
    DOUBLE PRECISION,PARAMETER::pi_16=4*atan(1.0d0)
    call system_clock ( t1, clock_rate, clock_max )
    n = 1000
    b = n/2
    a = -n/2
    count = 0
    zeta_list = 0
    x = 0
    y = 0
    x1 = 0
    y1 = 0
    theta = 0
    call random_seed()
    !$OMP PARALLEL DO private(i, x, y, x1, y1, theta)
    do i = 1, size(zeta_list, 2)
        inner_loop: do
            call random_number(x)
            call random_number(y)
            x1 = a + FLOOR((b+1-a)*x)
            y1 = a + FLOOR((b+1-a)*y)
            if (abs(y1) <= abs(1/x1)) then
                call random_number(theta)
                zeta_list(1,i) = x1*sin(2*pi_16*theta)
                zeta_list(2,i) = x1*cos(2*pi_16*theta)
                exit inner_loop
            end if
        end do inner_loop
    end do
    !$OMP END PARALLEL DO
    write(*,*) zeta_list(:,1)
    write(*,*) zeta_list(:,2)
    call system_clock ( t2, clock_rate, clock_max )
    write ( *, * ) 'Elapsed real time = ', real ( t2 - t1 ) / real ( clock_rate) ,'seconds'
END PROGRAM RANDOM_DISTRIBUTION
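A note on the private clause used above: variables such as x, y, x1, y1 and theta are shared between threads by default, so without private the threads would overwrite each other's values. A shared counter like count in the question has the same problem. If a counter is still wanted, a reduction is one way to keep it consistent. The following is only a minimal sketch: the names attempts and npoints and the acceptance test x < 0.25 are hypothetical stand-ins, not part of the original code.

```fortran
program count_attempts_sketch
    implicit none
    integer, parameter :: npoints = 100000   ! hypothetical sample count
    integer :: i, attempts
    double precision :: x

    attempts = 0
    ! Each thread gets its own x and its own partial sum of attempts;
    ! the partial sums are added together when the loop ends.
    !$omp parallel do private(i, x) reduction(+:attempts)
    do i = 1, npoints
        do
            call random_number(x)
            attempts = attempts + 1
            ! Hypothetical acceptance test, standing in for abs(y1) <= abs(1/x1)
            if (x < 0.25d0) exit
        end do
    end do
    !$omp end parallel do

    write(*,*) 'attempts per accepted sample:', real(attempts) / real(npoints)
end program count_attempts_sketch
```

Compiled with gfortran -fopenmp, this should report the average number of tries per accepted sample without any race on the counter.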
The use of random_number in OpenMP threads is safe with gfortran 5, but you need gfortran 7 to get a threaded random number generator. Timings with two cores:
user@pc$ gfortran-5 -O3 -Wall -fopenmp -o prd prd.f90
user@pc$ OMP_NUM_THREADS=1 ./prd
47.496326386583306 237.29327630545950
-101.11803913888293 147.70288474064185
Elapsed real time = 3.47700000 seconds
user@pc$ OMP_NUM_THREADS=2 ./prd
0.0000000000000000 -0.0000000000000000
-160.53394672041205 49.526275353269853
Elapsed real time = 12.1479998 seconds
user@pc$ rm fort.1*
user@pc$ gfortran-5 -O3 -Wall -fopenmp -o prd prd.f90
user@pc$ OMP_NUM_THREADS=1 ./prd
Elapsed real time = 3.05100012 seconds
user@pc$ OMP_NUM_THREADS=2 ./prd
Elapsed real time = 9.09599972 seconds
user@pc$ gfortran-6 -O3 -Wall -fopenmp -o prd prd.f90
user@pc$ OMP_NUM_THREADS=1 ./prd
Elapsed real time = 3.09200001 seconds
user@pc$ OMP_NUM_THREADS=2 ./prd
Elapsed real time = 12.3350000 seconds
user@pc$ gfortran-7 -O3 -Wall -fopenmp -o prd prd.f90
user@pc$ OMP_NUM_THREADS=1 ./prd
Elapsed real time = 1.83200002 seconds
user@pc$ OMP_NUM_THREADS=2 ./prd
Elapsed real time = 0.986999989 seconds
The result is quite clear: prior to gfortran 7, parallelizing this code with OpenMP slows it down significantly.
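If upgrading the compiler is not an option, one possible workaround is to give each thread its own generator state, so threads do not contend for a shared generator. The sketch below swaps random_number for a simple xorshift64 generator. This is my own substitution, not something from the code above, and it has not been vetted statistically. The names next_uniform, state and npoints, and the seeding scheme, are assumptions made for illustration.

```fortran
program private_rng_sketch
    use, intrinsic :: iso_fortran_env, only: int64, real64
    use omp_lib, only: omp_get_thread_num
    implicit none
    integer, parameter :: npoints = 100000   ! hypothetical sample count
    real(real64) :: samples(npoints)
    integer(int64) :: state
    integer :: i

    !$omp parallel private(state)
    ! Hypothetical seeding: any nonzero value that differs per thread.
    ! Adjacent seeds like these may give correlated early outputs.
    state = 88172645463325252_int64 + int(omp_get_thread_num(), int64)
    !$omp do
    do i = 1, npoints
        samples(i) = next_uniform(state)
    end do
    !$omp end do
    !$omp end parallel

    write(*,*) 'min, max:', minval(samples), maxval(samples)

contains

    ! One xorshift64 step, then map the top 53 bits to [0, 1).
    function next_uniform(s) result(u)
        integer(int64), intent(inout) :: s
        real(real64) :: u
        s = ieor(s, ishft(s, 13))
        s = ieor(s, ishft(s, -7))
        s = ieor(s, ishft(s, 17))
        u = real(ishft(s, -11), real64) * 2.0_real64**(-53)
    end function next_uniform

end program private_rng_sketch
```

This needs -fopenmp so that the omp_lib module is available. For real work, a generator designed for parallel streams would be a safer choice than this toy.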