How to allocate arrays of arrays in a structure with CUDA Fortran?

Question

With CUDA, I'm trying to allocate arrays inside a structure, but I'm running into a problem and I don't know why. Here is a short program (stored in a file called struct.cuf) that reproduces it. I'm compiling with PGI 16.10, using the following options: -O3 -Mcuda=cc60 -tp=x64 struct.cuf -o struct_out

module structure

type mytype
 integer :: alpha,beta,gamma
 real,dimension(:),pointer :: a
end type mytype

type mytypeDevice
 integer :: alpha,beta,gamma
 real,dimension(:),pointer,device :: a
end type mytypeDevice

end module structure

program main
 use cudafor
 use structure

 type(mytype) :: T(3)
 type(mytypeDevice),device :: T_Device(3)

 ! For the host
 do i=1,3
  allocate(T(i)%a(10))
 end do
 T(1)%a=1; T(2)%a=2; T(3)%a=3

 ! For the device
 print *, 'Everything from now is ok'
 do i=1,3
  allocate(T_Device(i)%a(10))
 end do
 !do i=1,3
 ! T_Device(i)%a=T(i)%a
 !end do

end program main

The output:

 Everything from now is ok
Segmentation fault

What am I doing wrong here?

The only working solution I have found is to store the values in separate arrays and transfer those to the GPU, but that is very "heavy", especially if I use a lot of structures like mytype.

EDIT: The code has been modified to use Vladimir F's solution. If I remove the device attribute from the T_Device(3) declaration, the allocation seems to work, and so does assigning values (the commented lines below the allocation). But I need the device attribute on T_Device(3), because I'm going to use it in kernels.

Thanks!

Answer 1

I think you need a device pointer:

type mytype_device
 ...
 real,dimension(:),pointer, device :: a
end type

I have never used CUDA Fortran in my life, but it seems obvious enough to wager.

Answer 2

The problem here is how you have declared T_Device. To use host-side allocation, you first populate a copy of the device structure in host memory, and then copy it to device memory. This:

 type(mytypeDevice) :: T_Device(3)

 do i=1,3
  allocate(T_Device(i)%a(10))
 end do

will work correctly. This is a very standard design pattern in C++-based CUDA code, and the principle here is identical.
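To make the two-step pattern concrete, here is a minimal sketch rather than a tested program. The names T_Staging and host_values are hypothetical and do not appear in the question. The allocation and the per-component assignment correspond to what the question's EDIT reports as working once the device attribute is removed from the array of structures. The final whole-array assignment to T_Device is an assumption about how the compiler handles derived-type copies between host and device, so check it against your compiler version.

```fortran
! Sketch only: assumes a CUDA Fortran compiler (e.g. PGI) and a CUDA-capable GPU.
module structure
  implicit none

  type mytypeDevice
    integer :: alpha, beta, gamma
    real, dimension(:), pointer, device :: a   ! component points to device memory
  end type mytypeDevice

end module structure

program main
  use cudafor
  use structure
  implicit none

  ! T_Staging (hypothetical name) lives in host memory, but its %a
  ! components point to device memory.
  type(mytypeDevice) :: T_Staging(3)
  ! T_Device lives in device memory; this is what a kernel would use.
  type(mytypeDevice), device :: T_Device(3)
  real :: host_values(10)
  integer :: i

  do i = 1, 3
    ! The structure being filled in is in host memory, so the host
    ! can safely record the newly allocated device array in it.
    allocate(T_Staging(i)%a(10))
    host_values = real(i)
    T_Staging(i)%a = host_values   ! host-to-device copy of the data
  end do

  ! Assumption: derived-type assignment copies the structures, including
  ! their device pointer components, from host memory to device memory.
  T_Device = T_Staging

  ! ... launch kernels that read T_Device(i)%a here ...

  do i = 1, 3
    deallocate(T_Staging(i)%a)
  end do
end program main
```

The likely reason the original program crashes is that, with the device attribute on the array of structures, allocate on the host has to write the component's pointer information into memory that only the GPU can address. Keeping a host-resident staging copy avoids that write.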

2 answers

Answer 1
1 2017-06-21 15:36:58

Answer 2
1