
C++ classes with dynamic allocation in CUDA?

I have a basic doubt about porting C++ classes to CUDA, and I cannot find a direct, clear answer to what seems to be a real pain point in the end.

I think one would agree that C++ code for the host very often uses the new/delete operators in constructors and destructors. Regarding porting C++ code to CUDA, there are a few posts claiming that it is 'easy', or at least getting easier, and the main evidence given is examples with __host__ __device__ decorators. It is also not hard to find posts pointing out that dynamic allocation on the device usually carries a serious performance penalty. So, what is one supposed to do with C++ classes in CUDA?

Adding decorators does not change the dynamic memory allocation that happens inside the constructors and destructors. It seems one really does need to rewrite the C++ classes without new/delete. In my experience it was striking how badly a class using new/delete behaves compared with static allocation; there are obvious reasons for it, but it is really bad, like going back to a processor from 20 years ago... So, what do people who have ported C++ applications with dynamic allocation actually do (beyond a few doubles in an array that can be counted on one's hands)?

The standard approach is to change the scope and life cycle of objects within the code so that it isn't necessary to continuously create and destroy objects as part of computations on the device. Memory allocation in most distributed memory architectures (CUDA, HPC clusters, etc) is expensive, and the usual solution is to use it as sparingly as possible and amortise the cost of the operation by extending the lifetime of objects.

Ideally, create all the objects you need at the beginning of the program, even if that means pre-allocating a pool of objects which will be consumed as the program runs. That is more efficient than ad-hoc memory allocation and deallocation. It also avoids problems with memory fragmentation, which can become an issue on GPU hardware, where page sizes are rather large.
