简体   繁体   中英

Can nvlink inline device functions from separate compilation units?

If the separate compilation units that are fed as input to nvlink contain cuda kernels and device functions that invoke device functions marked as __forceinline__ , will these functions be inlined? Assume they would be inlined if one put all the source code into a single file.

If the separate compilation units that are fed as input to nvlink contain cuda kernels and device functions that invoke device functions marked as __forceinline__ , will these functions be inlined?

To the best of my knowledge, the CUDA device code linker can't do this. The __forceinline__ directive is a compiler level operation, and after compilation there is no way of marking code as inlineable in either PTX or SASS. The CUDA device code compiler should emit a warning that an external inline function was used but not defined if you try this.

If you want functions to be compiled inline, you have to (unsurprisingly) use a compiler, not a linker.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM