Trying to parallelize a loop in OpenMP for Fortran that solves the Laplacian in cylindrical coordinates

Apologies if the code is not great; I am fairly new to this. I have the following code and I am unsure how to parallelize it with OpenMP. I have tried leaving the array (phi_test) as a shared variable. That gives a correct answer for a printed point ('this is point (60, 50) = 221.84875522778384'), but it runs much slower and the final 6-7 decimals vary from run to run. I have also tried declaring the array as a reduction variable, since, before the boundary conditions are set, the update to phi_test looks like something a reduction could handle. That produces 'this is point (60, 50) = 221.84874961635666', where all of the digits past the fourth decimal place are now different but are consistent from run to run. I also plan to add a convergence condition to this loop, and I am wondering whether that can be done inside the loop or whether I should do it outside the loop.

I would appreciate it if anyone could point me to the correct way to make this loop parallel, and I would also welcome any coding tips. If you need me to attach the rest of the code, please let me know. Thank you.

! Making the loop parallel using OpenMP, I am currently struggling with this part 
!$omp parallel do default(none) shared(r_outer, l_outer, r_inner, l_inner, phi_initial, &
  !$omp seed, free_parameter, phi_test) private(rand_z, rand_r, outerloopiteration, U_potential) 
  !!$omp reduction(+: phi_test)
  bigloop : do outerloopiteration = 1, max_iters
   
    ! Creating two random numbers and converting them to integers to update grid points arbitrarily
    rand_z = nint(r_num(seed) * real(l_outer, kind=dp))
    rand_r = nint(r_num(seed) * real(r_outer, kind=dp))
   
    ! The code to solve the problem, split separately into two conditions for if rand_r = 0 or not      
    if  (rand_r == 0) then
      U_potential = 2.0_dp/3.0_dp * phi_test(1, rand_z) + (1.0_dp / 6.0_dp) * (phi_test(0, rand_z+1)&
       + phi_test(0, rand_z-1))
    else
      U_potential = 1.0_dp/4.0_dp * (phi_test(rand_r+1, rand_z) + phi_test(rand_r-1, rand_z) &
       + phi_test(rand_r, rand_z+1) + phi_test(rand_r, rand_z-1)) + &
      (1.0_dp/(8.0_dp*rand_r)) * (phi_test(rand_r+1, rand_z) - phi_test(rand_r-1, rand_z))
    end if
    
    ! Adjusting the new value at the random point using a free parameter to adjust convergence
    phi_test(rand_r, rand_z) = phi_test(rand_r, rand_z) + free_parameter * &
    (U_potential - phi_test(rand_r, rand_z))  
         
    ! Boundary conditions
    phi_test(0 , 0:l_inner) = phi_initial
    phi_test(0:r_inner, 0) = phi_initial
    phi_test(1:r_inner, 1:l_inner) = phi_initial
           
  end do bigloop
  !$omp end parallel do

I have tried moving the main section of the loop into its own subroutine, but I then got NaN for the point I was trying to print. I have also tried using reduction and shared for the array, but as described above, these produce different values for the answer. I am currently looking into other ways of doing this and reading further into the OpenMP syntax.

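Arbitrary updating of grid points is called "chaotic iteration", and (1) no one does that because it doesn't work terribly well (if at all), and (2) it's very hard to parallelize. In the loop above, different threads read and write overlapping entries of phi_test with no synchronization, which is a data race and would explain the run-to-run variation in the trailing digits.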

For an easy-to-code solution, I suggest you look into Jacobi iteration, which is fully parallel, or SOR / Gauss-Seidel, which can be parallelized through multi-coloring or wavefronts. Even better would be a conjugate gradient method, which converges much faster but takes a little more coding.
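
As a rough illustration of the Jacobi option, here is a minimal sketch that reuses the two update formulas from the question. The module name jacobi_cyl, the routine jacobi_solve, the logical mask fixed (marking points held at their boundary value), and the tol and max_sweeps arguments are all assumptions made for this example and do not appear in the original code; unit grid spacing is assumed, as in the question. Each point is updated from the previous sweep's values only, so no thread reads an entry that another thread is writing, and the convergence test is a max reduction checked once per sweep, outside the parallel loop.

```fortran
module jacobi_cyl
  use, intrinsic :: iso_fortran_env, only: dp => real64
  implicit none
contains

  ! Jacobi sweeps for the axisymmetric Laplace equation on a unit-spaced
  ! (r, z) grid. phi and fixed are assumed to have the same shape.
  ! Points with fixed(i, j) = .true. keep their value, as do the outer
  ! edges i = nr, j = 0 and j = nz, which are never updated.
  subroutine jacobi_solve(phi, fixed, tol, max_sweeps, sweeps, max_change)
    real(dp), intent(inout) :: phi(0:, 0:)
    logical,  intent(in)    :: fixed(0:, 0:)
    real(dp), intent(in)    :: tol
    integer,  intent(in)    :: max_sweeps
    integer,  intent(out)   :: sweeps
    real(dp), intent(out)   :: max_change
    real(dp), allocatable   :: phi_new(:, :)
    real(dp) :: delta
    integer  :: i, j, it, nr, nz

    nr = ubound(phi, 1)
    nz = ubound(phi, 2)
    allocate(phi_new(0:nr, 0:nz))
    phi_new = phi            ! fixed and edge points carry over unchanged
    delta = 0.0_dp

    do it = 1, max_sweeps
      delta = 0.0_dp
      !$omp parallel do default(none) shared(phi, phi_new, fixed, nr, nz) &
      !$omp private(i, j) reduction(max: delta)
      do j = 1, nz - 1
        do i = 0, nr - 1
          if (fixed(i, j)) cycle
          if (i == 0) then
            ! On the axis (r = 0), same formula as in the question
            phi_new(i, j) = 2.0_dp / 3.0_dp * phi(1, j) &
                          + (phi(0, j+1) + phi(0, j-1)) / 6.0_dp
          else
            phi_new(i, j) = 0.25_dp * (phi(i+1, j) + phi(i-1, j) &
                          + phi(i, j+1) + phi(i, j-1)) &
                          + (phi(i+1, j) - phi(i-1, j)) / (8.0_dp * real(i, dp))
          end if
          delta = max(delta, abs(phi_new(i, j) - phi(i, j)))
        end do
      end do
      !$omp end parallel do
      phi = phi_new
      if (delta < tol) exit  ! convergence test, outside the parallel loop
    end do

    sweeps = min(it, max_sweeps)
    max_change = delta
  end subroutine jacobi_solve

end module jacobi_cyl
```

A caller would set the boundary values in phi, mark the held region in fixed (for the question's setup, presumably the block 0:r_inner by 0:l_inner), and call jacobi_solve once. When compiled with OpenMP enabled (for example gfortran -fopenmp), the result should not depend on the number of threads, because each sweep reads only phi and writes only phi_new. A red-black Gauss-Seidel or SOR version would follow the same pattern with two half-sweeps per iteration.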

For much more background information, see my textbook: https://theartofhpc.com/istc.html
