How do I compile a CUDA shared library that depends on c++ object files?

Question

I am writing a C++/CUDA library that makes multiple kernel calls.

EDIT: I think the original post was a little long, so I have created a better example. Original post below.

Here is the project simplified to a minimal example. It will not compile, and gives the following error:

nvcc -Xcompiler -fPIC -x cu -c -dc -o myclass.o myclass.cpp
nvcc -Xcompiler -fPIC --lib myclass.o kernel.cu -o libhelpme.a -I.
ptxas fatal   : Unresolved extern function '_ZN7myclassC1Ei'
makefile:8: recipe for target 'lib' failed
make: *** [lib] Error 255

All the documentation I can find on this topic covers building an executable or an object file. I want neither; I specifically want a static library. How do I do this?

The code:

makefile

program: class lib
    nvcc -o program main.cc -I. -L. -lhelpme

class:
    nvcc -Xcompiler -fPIC -x cu -c -dc -o myclass.o myclass.cpp

lib: class
    nvcc -Xcompiler -fPIC --lib myclass.o kernel.cu -o libhelpme.a -I.

clean:
    rm *.o *.a program

main.cc

#include "stdio.h"
#include <iostream>
#include "kernel.h"
int main()
{
    std::cout << "hello world" << std::endl;
    wrapper();
    return 0;
}

myclass.h

#ifdef __CUDACC__
#define COMMON __host__ __device__
#else
#define COMMON
#endif
#ifndef M
#define M
class myclass
{
    public:
        int x;
        COMMON myclass(int y);
        COMMON void increment();
};
#endif

myclass.cpp

#include "myclass.h"
#ifdef __CUDACC__
#define COMMON __host__ __device__
#else
#define COMMON
#endif
COMMON myclass::myclass(int y)
{
    x = y;
}
COMMON void myclass::increment()
{
    x += 1;
}

kernel.h

extern void wrapper();

kernel.cu

#include "stdio.h"
#include <iostream>
#include "myclass.h"

class myotherclass
{
    public:
        int x;
        COMMON myotherclass(int y) {x = y;}
        COMMON void decrement() {x -= 1;}
};

__global__ void dokernel()
{
    myotherclass p(8); //This compiles just fine.
    myclass q(7); //This will not compile
}

void wrapper()
{
    std::cout << "hello from wrapper\n";
    myclass q(1);
    myotherclass s(4);
    std::cout << "x = " << s.x << "\n";
    s.decrement();
    std::cout << "x = " << s.x << "\n";
    dokernel<<<1,1>>>();

}

I am slowly becoming convinced that this is impossible...

ORIGINAL POST: I have a number of C++ source/header files, e.g. vec.cpp and vec.h, that are compiled to object files, e.g. vec.o.

Here is an example: vec.h

class vec
{
    public:
        realnum x,y,z;
        __host__ __device__ vec(float _x, float _y, float _z);
};

vec.cpp

__host__ __device__ vec::vec(float _x, float _y, float _z) {x = _x; y = _y; z = _z;}

Here is my full makefile (still in the making):

CC=nvcc
CFLAGS = -Wall -g -O3

HOME_DIR = $(shell pwd)

SRC_DIR := ${HOME_DIR}/../src
OBJ_DIR := ${HOME_DIR}/../lib
LIB_DIR := ${HOME_DIR}/../lib
KER_DIR := ${HOME_DIR}/../kernel
SRC_FILES := $(wildcard $(SRC_DIR)/*.cpp)
OBJ_FILES := $(patsubst $(SRC_DIR)/%.cpp,$(OBJ_DIR)/%.o,$(SRC_FILES))

LPROPS := -L${LIB_DIR} -lcuprops
LMAIN := -L${LIB_DIR} -lsharc
LRDR := -L${LIB_DIR} -lcurdr

INCL_PROPS := -I${SRC_DIR} -I${KER_DIR}

program: $(LIB_DIR)/libcurdr.so ${LIB_DIR}/libsharc.so $(LIB_DIR)/libcuprops.so $(OBJ_FILES)
    ${CC} -o $@ main.cc -I${SRC_DIR} ${LPROPS} ${LMAIN} ${LRDR}

${LIB_DIR}/libsharc.so: $(OBJ_FILES) $(LIB_DIR)/libcuprops.so
    ${CC} -Xcompiler -fPIC --shared  ${OBJ_FILES} -o $(LIB_DIR)/libsharc.so ${INCL_PROPS}

$(OBJ_DIR)/%.o: $(SRC_DIR)/%.cpp
    ${CC} -Xcompiler -fPIC -dc -o $@ $< ${INCL_PROPS}

$(LIB_DIR)/libcuprops.so:
    ${CC} -Xcompiler -fPIC --shared -o $(LIB_DIR)/libcuprops.so ${KER_DIR}/nvidia_properties.cu ${INCL_PROPS}

$(LIB_DIR)/libcurdr.so: $(OBJ_FILES)
    ${CC} -Xcompiler -fPIC --shared ${OBJ_FILES} ${KER_DIR}/gpu_rdr.cu -o $(LIB_DIR)/libcurdr.so ${INCL_PROPS}

clean:
    rm ${LIB_DIR}/*

When I make I get the following:

ptxas fatal   : Unresolved extern function '_ZN3vecC1Eddd'

I have a kernel where I try to initialize a vector:

__global__ void SOME_KERNEL()
{
    int row = blockIdx.y*blockDim.y + threadIdx.y;
    int col = blockIdx.x*blockDim.x + threadIdx.x;
    if (row < dev_height && col < dev_width)
    {
        vec t(0,0,0); //Compiles nicely when I comment out this line!
    }
}

I have read about separate compilation and linking, where it is claimed that the typical project layout (which I believe I am using) works with separate compilation and linking via the following:

objects = main.o particle.o v3.o

all: $(objects)
    nvcc -arch=sm_20 $(objects) -o app

%.o: %.cpp
    nvcc -x cu -arch=sm_20 -I. -dc $< -o $@

clean:
    rm -f *.o app

Note the use of the "-dc" flag, which is consistent with this answer.

At this point I have tried so many things that I am completely lost. So, how can I compile this project?

In the case that it is helpful, here is the full output from make:

nvcc -Xcompiler -fPIC -dc -o /home/wvn/dirs/projects/sharc/build/../lib/mat33.o /home/wvn/dirs/projects/sharc/build/../src/mat33.cpp -I/home/wvn/dirs/projects/sharc/build/../src -I/home/wvn/dirs/projects/sharc/build/../kernel
nvcc -Xcompiler -fPIC -dc -o /home/wvn/dirs/projects/sharc/build/../lib/vec.o /home/wvn/dirs/projects/sharc/build/../src/vec.cpp -I/home/wvn/dirs/projects/sharc/build/../src -I/home/wvn/dirs/projects/sharc/build/../kernel
nvcc -Xcompiler -fPIC -dc -o /home/wvn/dirs/projects/sharc/build/../lib/sharc.o /home/wvn/dirs/projects/sharc/build/../src/sharc.cpp -I/home/wvn/dirs/projects/sharc/build/../src -I/home/wvn/dirs/projects/sharc/build/../kernel
nvcc -Xcompiler -fPIC -dc -o /home/wvn/dirs/projects/sharc/build/../lib/boundingbox.o /home/wvn/dirs/projects/sharc/build/../src/boundingbox.cpp -I/home/wvn/dirs/projects/sharc/build/../src -I/home/wvn/dirs/projects/sharc/build/../kernel
nvcc -Xcompiler -fPIC --shared /home/wvn/dirs/projects/sharc/build/../lib/mat33.o /home/wvn/dirs/projects/sharc/build/../lib/vec.o /home/wvn/dirs/projects/sharc/build/../lib/sharc.o /home/wvn/dirs/projects/sharc/build/../lib/boundingbox.o /home/wvn/dirs/projects/sharc/build/../kernel/gpu_rdr.cu -o /home/wvn/dirs/projects/sharc/build/../lib/libcurdr.so -I/home/wvn/dirs/projects/sharc/build/../src -I/home/wvn/dirs/projects/sharc/build/../kernel
ptxas fatal   : Unresolved extern function '_ZN3vecC1Eddd'
makefile:32: recipe for target '/home/wvn/dirs/projects/sharc/build/../lib/libcurdr.so' failed
make: *** [/home/wvn/dirs/projects/sharc/build/../lib/libcurdr.so] Error 255

Answer 1

This answer refers to the UPDATED post, not the ORIGINAL post.

As indicated in the comments, one requirement is to add -x cu when compiling files that end in .cpp but contain CUDA constructs or device code. You have already added that in the proper place.

The COMMON macro has no bearing on this. The macro and -x cu serve separate purposes, and the macro is not a substitute for -x cu.

The other missing piece is that nvcc must be told, at the point where the static library is generated, that relocatable device code and device linking are needed. You do this by adding -rdc=true to that command line. Your makefile could be modified like this:

lib: class
    nvcc -Xcompiler -fPIC -rdc=true --lib myclass.o kernel.cu -o libhelpme.a -I.

This is needed because you have device code in one compilation unit (kernel.cu) that is calling device code in another compilation unit (myclass.cpp).

With that change, your UPDATED post/project compiles without issue for me and also runs without error.

$ make clean
rm *.o *.a program
$ make
nvcc -Xcompiler -fPIC -x cu -c -dc -o myclass.o myclass.cpp
nvcc -Xcompiler -fPIC -rdc=true --lib myclass.o kernel.cu -o libhelpme.a -I.
nvcc -o program main.cc -I. -L. -lhelpme
$ cuda-memcheck ./program
========= CUDA-MEMCHECK
hello world
hello from wrapper
x = 4
x = 3
========= ERROR SUMMARY: 0 errors
$
