
Illegal instruction error when running OpenMP in gfortran on a Mac

I am new to OpenMP, so bear with me. I use a MacBook Pro with OS X Mavericks and installed a version of gfortran compatible with my OS.

I implemented a very simple dynamic programming problem to solve what economists call the Neoclassical Growth Model.

The program runs without problems when built without OpenMP. However, when I compile it with the -fopenmp option and run it, I get either

Illegal Instruction: 4 or Segmentation Fault: 11

... so I am probably doing something very wrong.

I attach the main program, subroutines, modules and the .sh compilation file:

PROGRAM prg_dp1

    ! PRG_DP1 SOLVES THE GROWTH MODEL BY VALUE FUNCTION ITERATION OVER A DISCRETE GRID
    ! WITHOUT INTERPOLATION. EVALUATION IS DONE OVER NEXT PERIOD CAPITAL

    ! PROBLEM: PROGRAMMED AS MATRICES, IT LIMITS THE SIZE OF THE PROBLEM BEFORE SEGMENTATION FAULTS OCCUR

    USE modvar

    IMPLICIT NONE
    REAL(DP), DIMENSION(nk) :: v,v0,kp,c
    REAL(DP), DIMENSION(nk,nk) :: cm,um,vm
    REAL(DP) :: kstar,tbegin,tend
    INTEGER :: it,ik1,ik2,ind(nk)

    ! START THE CPU TIMER
    CALL CPU_TIME(tbegin)


    ! DEFINITION OF THE PARAMETERS OF THE MODEL

    p(1)=1.0001     ! Intertemporal elasticity of substitution (SIGMA)
    p(2)=0.96       ! Intertemporal discount factor (BETA)
    p(3)=0.06       ! Depreciation rate (DELTA)
    p(4)=0.36       ! Share of capital in production (ALPHA)
    p(5)=0.00       ! (Parameter not needed)

    ! COMPUTATION OF THE STEADY STATE CAPITAL STOCK

    kstar=((1.0/p(2)-(1.0-p(3)))/p(4))**(1.0/(p(4)-1.0))

    ! FIRST I ALLOCATE AND CONSTRUCT THE GRID

    slope=1.0
    gkmin=0.0001
    gkmax=5.0*kstar
  !  ALLOCATE(gk(nk),ones(nk,nk))

    ALLOCATE(gk(nk))
!   ones=1.0
    CALL sub_grid_generation(gk,nk,slope,gkmin,gkmax)

    ! DEFINITION OF THE MATRICES OF CONSUMPTION AND UTILITY

    !$OMP PARALLEL  DEFAULT(SHARED) PRIVATE(ik1,ik2)
    !$OMP DO SCHEDULE(DYNAMIC)
    DO ik1=1,nk
        DO ik2=1,nk
            cm(ik1,ik2)=gk(ik1)**p(4)+(1.0-p(3))*gk(ik1)-gk(ik2)
        END DO
    END DO
    !$OMP END DO
    !$OMP END PARALLEL

   ! cm = gk**p(4)+(1.0-p(3))*gk-gk*ones


    WHERE (cm .le. 0.0)
        um=-1.0e+6
    ELSEWHERE
        um=(cm**(1.0-p(1))-1.0)/(1.0-p(1))
    END WHERE

    ! DYNAMIC PROGRAMMING STEP

    ! I first initialize the value function to zeros

    v0=0.0

    ! Main do has to be done by master-thread ... can I parallelize more?

    DO
        !$OMP PARALLEL DO PRIVATE(ik2)
        DO ik2=1,nk
            vm(:,ik2)=um(:,ik2)+p(2)*v0(ik2)
        END DO
        !$OMP END PARALLEL DO
        v=MAXVAL(vm,DIM=2)
        print *, MAXVAL(ABS(v-v0))
        IF (MAXVAL(ABS(v-v0)) .le. dp_tol) THEN
            EXIT
        ELSE
            v0=v
        END IF
    END DO

    ! Policy index: for each current capital ik1, the argmax over next-period capital ik2
    ind=MAXLOC(vm,DIM=2)

    kp=gk(ind)
    c=gk**p(4)+(1.0-p(3))*gk-kp
    open(unit=1,file='output.txt')
    DO ik1=1,nk
        write(1,'(4F10.5)') gk(ik1),v(ik1),kp(ik1),c(ik1)
    END DO
    close(1)
    DEALLOCATE(gk)

    CALL CPU_TIME(tend)

    PRINT *, tend-tbegin

END PROGRAM prg_dp1

SUBROUTINE sub_grid_generation(grid,gsize,slope,gridmin,gridmax)
    USE nrtype
    INTEGER, INTENT(IN) :: gsize
    REAL(DP), INTENT(IN) :: slope,gridmin,gridmax
    REAL(DP), INTENT(OUT) :: grid(gsize)
    INTEGER :: ig
    grid(1)=gridmin
    DO ig=2,gsize
        grid(ig)=gridmin+((gridmax-gridmin)/dfloat(gsize)**slope)*dfloat(ig)**slope
    END DO

END SUBROUTINE sub_grid_generation

MODULE nrtype
    INTEGER, PARAMETER :: I4B = SELECTED_INT_KIND(9)
    INTEGER, PARAMETER :: I2B = SELECTED_INT_KIND(4)
    INTEGER, PARAMETER :: I1B = SELECTED_INT_KIND(2)
    INTEGER, PARAMETER :: SP = KIND(1.0)
    INTEGER, PARAMETER :: DP = KIND(1.0D0)
    INTEGER, PARAMETER :: SPC = KIND((1.0,1.0))
    INTEGER, PARAMETER :: DPC = KIND((1.0D0,1.0D0))
    INTEGER, PARAMETER :: LGT = KIND(.true.)
    REAL(SP), PARAMETER :: PI=3.141592653589793238462643383279502884197_sp
    REAL(SP), PARAMETER :: PIO2=1.57079632679489661923132169163975144209858_sp
    REAL(SP), PARAMETER :: TWOPI=6.283185307179586476925286766559005768394_sp
    REAL(SP), PARAMETER :: SQRT2=1.41421356237309504880168872420969807856967_sp
    REAL(SP), PARAMETER :: EULER=0.5772156649015328606065120900824024310422_sp
    REAL(DP), PARAMETER :: PI_D=3.141592653589793238462643383279502884197_dp
    REAL(DP), PARAMETER :: PIO2_D=1.57079632679489661923132169163975144209858_dp
    REAL(DP), PARAMETER :: TWOPI_D=6.283185307179586476925286766559005768394_dp
    REAL(DP), PARAMETER :: gr=(5.0**0.5-1.0)/2.0
    TYPE sprs2_sp
        INTEGER(I4B) :: n,len
        REAL(SP), DIMENSION(:), POINTER :: val
        INTEGER(I4B), DIMENSION(:), POINTER :: irow
        INTEGER(I4B), DIMENSION(:), POINTER :: jcol
    END TYPE sprs2_sp
    TYPE sprs2_dp
        INTEGER(I4B) :: n,len
        REAL(DP), DIMENSION(:), POINTER :: val
        INTEGER(I4B), DIMENSION(:), POINTER :: irow
        INTEGER(I4B), DIMENSION(:), POINTER :: jcol
    END TYPE sprs2_dp
END MODULE nrtype

MODULE modvar
    USE nrtype
    IMPLICIT NONE
    REAL(DP), PARAMETER :: r_tol=1e-8
    REAL(DP), PARAMETER :: p_tol=1e-6

    REAL(DP), PARAMETER :: dp_tol=1e-6
    REAL(DP), PARAMETER :: c_tol=0.001
    REAL(DP), PARAMETER :: adj=0.5

    INTEGER, PARAMETER :: r_m=10000

    ! PARAMETERS THAT DEFINE THE DIMENSION OF THE PROBLEM

    INTEGER, PARAMETER :: nk=5000
    INTEGER, PARAMETER :: nz=21
    INTEGER, PARAMETER :: np=20000
    INTEGER, PARAMETER :: nt=5000

    INTEGER, PARAMETER :: maxit=10000

    INTEGER, PARAMETER :: dist_maxit=5000

    ! POLICY PARAMETER, COMMON ENDOGENOUS VARIABLES AND OTHER ALLOCATABLE ARRAYS

    REAL(DP), PARAMETER :: nw=0.0
    REAL(DP), PARAMETER :: ft=0.33
    REAL(DP) :: p(5),gkmin,gkmax,slope
    REAL(DP), ALLOCATABLE :: gk(:),gz(:),m(:,:),mss(:),ones(:,:)
END MODULE modvar

and the .sh file I use to compile:

export OMP_NUM_THREADS=8
gfortran -O2 -fopenmp -c nrtype.f90 modvar.f90 sub_grid_generation.f90 prg_dp1.f90

gfortran -O2 -fopenmp -fcheck=bounds -o myprog nrtype.o modvar.o sub_grid_generation.o prg_dp1.o

I know this is tedious, but I would appreciate some help.

Thank you

The other option is to make the large arrays cm, um, vm, and possibly also the other, smaller ones, allocatable. This will come in handy when you want to change the problem size, for example by reading it from somewhere, while keeping the same executable.

REAL(DP), DIMENSION(:,:),allocatable :: cm,um,vm

allocate(cm(nk,nk),um(nk,nk),vm(nk,nk))

It is a stack space issue. I tried running it with ifort, and even without OpenMP I got an illegal instruction; I had to specify -heap-arrays to get it to run properly. Once I added OpenMP, the illegal instruction error came back. The WHERE statement seems to be the problem code: in both the OpenMP and non-OpenMP runs, that is the part that causes it to fail.

OS X stack space is rather limited and you are creating large arrays. Using -heap-arrays helps, but once you use OpenMP that is no longer an option, and ulimit maxes out at ~64 MB.
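To put rough numbers on this, here is a small shell sketch of the arithmetic. It assumes REAL(DP) takes 8 bytes and that cm, um and vm (each nk x nk with nk=5000, as in modvar) all end up on the stack. The gfortran documentation states that -fopenmp implies -frecursive, which forces local arrays onto the stack, so that seems a plausible reading of why the gfortran build only fails with OpenMP. The variable names are purely illustrative.

```shell
#!/bin/sh
# Back-of-the-envelope size of the three nk x nk REAL(DP) arrays (cm, um, vm).
nk=5000            # grid size, taken from MODULE modvar
bytes_per_real=8   # assumption: REAL(DP) = KIND(1.0D0) occupies 8 bytes

per_array=$((nk * nk * bytes_per_real))   # bytes in one nk x nk array
total=$((3 * per_array))                  # bytes in cm + um + vm

per_array_mib=$((per_array / 1024 / 1024))
total_mib=$((total / 1024 / 1024))

echo "one array:    ${per_array_mib} MiB"
echo "three arrays: ${total_mib} MiB"

# Current soft stack limit for comparison (reported in KiB, or "unlimited").
echo "stack limit (ulimit -s, KiB): $(ulimit -s)"
```

With integer division this reports about 190 MiB per array and about 572 MiB for the three together, far above a stack limit of roughly 64 MB. Any temporaries the compiler creates for the WHERE construct would come on top of that.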

I found that adding this to your compilation works:

-Wl,-stack_size,0x40000000,-stack_addr,0xf0000000

This increases the stack size to 1 GB. It could probably be fine-tuned, but I tried 256 MB and it was still not enough.
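As a quick check of the numbers, and of where the option would go: -Wl,... passes its arguments through to the linker, so it belongs on the link step of the .sh file, not on the -c step. The sketch below only converts the hex size to MiB and prints a candidate link line. It does not run gfortran, and the object file names are simply those from the question's script.

```shell
#!/bin/sh
# Stack size requested from the OS X linker, as a hex byte count.
stack_hex=0x40000000
stack_bytes=$(($stack_hex))
stack_mib=$((stack_bytes / 1024 / 1024))
echo "stack_size ${stack_hex} = ${stack_mib} MiB"

# Candidate link line: the question's link step plus the linker options.
# This is only printed, not executed.
stack_flags="-Wl,-stack_size,${stack_hex},-stack_addr,0xf0000000"
objects="nrtype.o modvar.o sub_grid_generation.o prg_dp1.o"
link_cmd="gfortran -O2 -fopenmp -o myprog ${objects} ${stack_flags}"
echo "$link_cmd"
```

To try a smaller stack, change stack_hex (for example 0x30000000 is 768 MiB) and relink. Whether a given value is enough would have to be checked by running the program.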
