Illegal instruction error when running OpenMP in gfortran on Mac
I am new to OpenMP, so bear with me. I use a MacBook Pro with OS X Mavericks and installed a version of gfortran compatible with my OS.
I implemented a very simple dynamic programming problem to solve what economists call the Neoclassical Growth Model.
The program runs without problems when built without OpenMP. However, when I compile it with the -fopenmp option and run it, I get either
Illegal instruction: 4 or Segmentation fault: 11
so I am probably doing something very wrong.
I attach the main program, subroutines, modules and the .sh compilation file.
PROGRAM prg_dp1
! PRG_DP1 SOLVES THE GROWTH MODEL BY VALUE FUNCTION ITERATION OVER A DISCRETE GRID
! WITHOUT INTERPOLATION. EVALUATION IS DONE OVER NEXT PERIOD CAPITAL
! PROBLEM: PROGRAMMED AS MATRICES IT LIMITS THE SIZE OF THE PROGRAM BEFORE
! SEGMENTATION FAULTS OCCUR
USE modvar
IMPLICIT NONE
REAL(DP), DIMENSION(nk) :: v,v0,kp,c
REAL(DP), DIMENSION(nk,nk) :: cm,um,vm
REAL(DP) :: kstar,tbegin,tend
INTEGER :: it,ik1,ik2,ind(nk)
! INVOCATION OF THE OUTPUT FILES WHERE THE INFORMATION IS GOING TO BE WRITTEN
CALL CPU_TIME(tbegin)
! DEFINITION OF THE PARAMETERS OF THE MODEL
p(1)=1.0001 ! Intertemporal elasticity of substitution (SIGMA)
p(2)=0.96 ! Intertemporal discount factor (BETA)
p(3)=0.06 ! Depreciation rate (DELTA)
p(4)=0.36 ! Share of capital in production (ALPHA)
p(5)=0.00 ! (Parameter not needed)
! COMPUTATION OF THE STEADY STATE CAPITAL STOCK
kstar=((1.0/p(2)-(1.0-p(3)))/p(4))**(1.0/(p(4)-1.0))
! FIRST I ALLOCATE AND CONSTRUCT THE GRID
slope=1.0
gkmin=0.0001
gkmax=5.0*kstar
! ALLOCATE(gk(nk),ones(nk,nk))
ALLOCATE(gk(nk))
! ones=1.0
CALL sub_grid_generation(gk,nk,slope,gkmin,gkmax)
! DEFINITION OF THE MATRICES OF CONSUMPTION AND UTILITY
!$OMP PARALLEL DEFAULT(SHARED) PRIVATE(ik1,ik2)
!$OMP DO SCHEDULE(DYNAMIC)
DO ik1=1,nk
DO ik2=1,nk
cm(ik1,ik2)=gk(ik1)**p(4)+(1.0-p(3))*gk(ik1)-gk(ik2)
END DO
END DO
!$OMP END DO
!$OMP END PARALLEL
! cm = gk**p(4)+(1.0-p(3))*gk-gk*ones
WHERE (cm .le. 0.0)
um=-1.0e+6
ELSEWHERE
um=(cm**(1.0-p(1))-1.0)/(1.0-p(1))
END WHERE
! DYNAMIC PROGRAMMING STEP
! I first initialize the value function to zeros
v0=0.0
! Main do has to be done by master-thread ... can I parallelize more?
DO
!$OMP PARALLEL DO PRIVATE(ik2)
DO ik2=1,nk
vm(:,ik2)=um(:,ik2)+p(2)*v0(ik2)
END DO
!$OMP END PARALLEL DO
v=MAXVAL(vm,DIM=2)
print *, MAXVAL(ABS(v-v0))
IF (MAXVAL(ABS(v-v0)) .le. dp_tol) THEN
EXIT
ELSE
v0=v
END IF
END DO
! Policy index: for each current capital ik1, the column ik2 that maximizes vm(ik1,:)
ind=MAXLOC(vm,DIM=2)
kp=gk(ind)
c=gk**p(4)+(1.0-p(3))*gk-kp
open(unit=1,file='output.txt')
DO ik1=1,nk
write(1,'(4F10.5)') gk(ik1),v(ik1),kp(ik1),c(ik1)
END DO
close(1)
DEALLOCATE(gk)
CALL CPU_TIME(tend)
PRINT *, tend-tbegin
END PROGRAM prg_dp1
SUBROUTINE sub_grid_generation(grid,gsize,slope,gridmin,gridmax)
USE nrtype
INTEGER, INTENT(IN) :: gsize
REAL(DP), INTENT(IN) :: slope,gridmin,gridmax
REAL(DP), INTENT(OUT) :: grid(gsize)
INTEGER :: ig
grid(1)=gridmin
DO ig=2,gsize
grid(ig)=gridmin+((gridmax-gridmin)/dfloat(gsize)**slope)*dfloat(ig)**slope
END DO
END SUBROUTINE sub_grid_generation
MODULE nrtype
INTEGER, PARAMETER :: I4B = SELECTED_INT_KIND(9)
INTEGER, PARAMETER :: I2B = SELECTED_INT_KIND(4)
INTEGER, PARAMETER :: I1B = SELECTED_INT_KIND(2)
INTEGER, PARAMETER :: SP = KIND(1.0)
INTEGER, PARAMETER :: DP = KIND(1.0D0)
INTEGER, PARAMETER :: SPC = KIND((1.0,1.0))
INTEGER, PARAMETER :: DPC = KIND((1.0D0,1.0D0))
INTEGER, PARAMETER :: LGT = KIND(.true.)
REAL(SP), PARAMETER :: PI=3.141592653589793238462643383279502884197_sp
REAL(SP), PARAMETER :: PIO2=1.57079632679489661923132169163975144209858_sp
REAL(SP), PARAMETER :: TWOPI=6.283185307179586476925286766559005768394_sp
REAL(SP), PARAMETER :: SQRT2=1.41421356237309504880168872420969807856967_sp
REAL(SP), PARAMETER :: EULER=0.5772156649015328606065120900824024310422_sp
REAL(DP), PARAMETER :: PI_D=3.141592653589793238462643383279502884197_dp
REAL(DP), PARAMETER :: PIO2_D=1.57079632679489661923132169163975144209858_dp
REAL(DP), PARAMETER :: TWOPI_D=6.283185307179586476925286766559005768394_dp
REAL(DP), PARAMETER :: gr=(5.0**0.5-1.0)/2.0
TYPE sprs2_sp
INTEGER(I4B) :: n,len
REAL(SP), DIMENSION(:), POINTER :: val
INTEGER(I4B), DIMENSION(:), POINTER :: irow
INTEGER(I4B), DIMENSION(:), POINTER :: jcol
END TYPE sprs2_sp
TYPE sprs2_dp
INTEGER(I4B) :: n,len
REAL(DP), DIMENSION(:), POINTER :: val
INTEGER(I4B), DIMENSION(:), POINTER :: irow
INTEGER(I4B), DIMENSION(:), POINTER :: jcol
END TYPE sprs2_dp
END MODULE nrtype
MODULE modvar
USE nrtype
IMPLICIT NONE
REAL(DP), PARAMETER :: r_tol=1e-8
REAL(DP), PARAMETER :: p_tol=1e-6
REAL(DP), PARAMETER :: dp_tol=1e-6
REAL(DP), PARAMETER :: c_tol=0.001
REAL(DP), PARAMETER :: adj=0.5
INTEGER, PARAMETER :: r_m=10000
! PARAMETER THAT DEFINE THE DIMENSION OF THE PROBLEM
INTEGER, PARAMETER :: nk=5000
INTEGER, PARAMETER :: nz=21
INTEGER, PARAMETER :: np=20000
INTEGER, PARAMETER :: nt=5000
INTEGER, PARAMETER :: maxit=10000
INTEGER, PARAMETER :: dist_maxit=5000
! POLICY PARAMETER, COMMON ENDOGENOUS VARIABLES AND OTHER ALLOCATABLE ARRAYS
REAL(DP), PARAMETER :: nw=0.0
REAL(DP), PARAMETER :: ft=0.33
REAL(DP) :: p(5),gkmin,gkmax,slope
REAL(DP), ALLOCATABLE :: gk(:),gz(:),m(:,:),mss(:),ones(:,:)
END MODULE modvar
and the .sh file I use to compile:
export OMP_NUM_THREADS=8
gfortran -O2 -fopenmp -c nrtype.f90 modvar.f90 sub_grid_generation.f90 prg_dp1.f90
gfortran -O2 -fopenmp -fcheck=bounds -o myprog nrtype.o modvar.o sub_grid_generation.o prg_dp1.o
I know this is tedious, but I would appreciate some help.
Thank you
The other option is to make the arrays cm, um, vm, and possibly also the smaller ones, allocatable. This comes in handy when you want to change the problem size, for example by reading it from somewhere at run time, while keeping the same executable.
REAL(DP), DIMENSION(:,:),allocatable :: cm,um,vm
allocate(cm(nk,nk),um(nk,nk),vm(nk,nk))
It is a stack space issue. I tried running it with ifort, and even without OpenMP I got an illegal instruction error; I had to specify -heap-arrays to get it to run properly. Once I added OpenMP, the illegal instruction error came back.
The WHERE statement seems to be the problem code. In both the OpenMP and non-OpenMP runs, that is the part that causes the failure.
OS X stack space is rather limited and you are creating large arrays. Using -heap-arrays helps, but once you use OpenMP that is no longer a possibility, and ulimit is maxed out at ~64 MB.
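To see the limits being discussed on a given machine, a small sketch like the following may help. It assumes a shell whose ulimit builtin accepts -s (stack size, reported in KiB) together with -S and -H for the soft and hard limits, as bash does; the to_mib helper is a name made up for this example. As background, the gfortran documentation notes that -fopenmp implies -frecursive, so local arrays are placed on the stack, which is why the limit matters as soon as OpenMP is switched on.

```shell
#!/bin/sh
# Sketch: report the soft and hard stack limits of the current shell.
# Assumption: `ulimit -s` is supported and prints KiB or "unlimited".

# to_mib VALUE -> VALUE converted from KiB to MiB ("unlimited" passes through).
# Hypothetical helper, not part of any tool mentioned in the thread.
to_mib() {
    case "$1" in
        unlimited) echo "unlimited" ;;
        *) echo "$(( $1 / 1024 )) MiB" ;;
    esac
}

soft=$(ulimit -S -s)   # limit currently in effect
hard=$(ulimit -H -s)   # ceiling that `ulimit -s` can raise the soft limit to

echo "soft stack limit: $(to_mib "$soft")"
echo "hard stack limit: $(to_mib "$hard")"
```

If the hard limit reported here is smaller than the arrays the program needs, raising the soft limit with ulimit cannot be enough on its own.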
I found that adding this to your compilation works:
-Wl,-stack_size,0x40000000,-stack_addr,0xf0000000
This increases the stack size to 1 GB. It could probably be fine-tuned, but I tried 256 MB and it was still not enough.
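A back-of-the-envelope calculation suggests why 256 MB fell short while 1 GB worked. The sketch below assumes REAL(DP) occupies 8 bytes and counts only the three nk x nk arrays cm, um and vm with nk=5000 from modvar; any temporaries the compiler creates would come on top of that. The footprint_mib function is a name invented for this example.

```shell
#!/bin/sh
# Sketch: estimate the stack space needed by cm, um and vm.
nk=5000            # grid size, from MODULE modvar
bytes_per_real=8   # assumption: REAL(DP) = KIND(1.0D0) is 8 bytes
n_arrays=3         # cm, um, vm

# footprint_mib N COUNT -> MiB used by COUNT arrays of N x N reals (rounded down).
footprint_mib() {
    echo $(( $2 * $1 * $1 * bytes_per_real / 1024 / 1024 ))
}

need=$(footprint_mib "$nk" "$n_arrays")
echo "cm, um, vm together: ${need} MiB"

# Stack sizes mentioned in the thread: ~64 MB (ulimit ceiling),
# 256 MB (tried, failed) and 0x40000000 bytes = 1024 MiB (worked).
for stack_mib in 64 256 1024; do
    if [ "$stack_mib" -gt "$need" ]; then
        echo "${stack_mib} MiB stack: large enough for the three arrays"
    else
        echo "${stack_mib} MiB stack: too small"
    fi
done
```

Under these assumptions the three arrays alone need roughly 572 MiB, so anything at or below 512 MB would be expected to fail for nk=5000.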