Why does cuda-gdb launch multiple threads?
When I start my program in cuda-gdb, the output looks like this:
[New Thread 0x7fffef8ea700 (LWP 8003)]
[New Thread 0x7fffe35b2700 (LWP 8010)]
[New Thread 0x7fffe2db1700 (LWP 8011)]
[New Thread 0x7fffe25b0700 (LWP 8012)]
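I don't understand why these threads are started at the beginning. I have not launched the program in multithreaded mode. I am using MPI, but I started only one process. So where do these threads come from?
This does not affect my debugging in any way; I just don't understand what it means.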
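The threads you are seeing are created by the CUDA runtime library and are not directly related to cuda-gdb itself. If you run the same code under plain gdb, you will see the same messages.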
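If you want to see what these threads are doing or where they come from, compile your code with the -g flag, set a breakpoint in your code (for example, right before a CUDA kernel launch), run it, and then run the following command in the gdb console: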
thread apply all backtrace
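This command does the same thing as gdb's backtrace, except that it shows the backtrace of every thread in the program.
In my case, after starting the program I got the following messages: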
[New Thread 0x7fffeffb3700 (LWP 7141)]
[New Thread 0x7fffef731700 (LWP 7142)]
[New Thread 0x7fffeef30700 (LWP 7143)]
Running the command above in the gdb console, I see the following output:
(gdb) thread apply all backtrace
Thread 4 (Thread 0x7fffeef30700 (LWP 7143)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x00007ffff63c19b7 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007ffff6386bb7 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007ffff63c0f48 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007ffff79bf064 in start_thread (arg=0x7fffeef30700) at pthread_create.c:309
#5 0x00007ffff6cce62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Thread 3 (Thread 0x7fffef731700 (LWP 7142)):
#0 0x00007ffff6cc5aed in poll () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007ffff63bf6a3 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007ffff642261e in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007ffff63c0f48 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007ffff79bf064 in start_thread (arg=0x7fffef731700) at pthread_create.c:309
#5 0x00007ffff6cce62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Thread 2 (Thread 0x7fffeffb3700 (LWP 7141)):
#0 0x00007ffff6ccfa9f in accept4 (fd=13, addr=..., addr_len=0x7fffeffb2e18, flags=-1) at ../sysdeps/unix/sysv/linux/accept4.c:45
#1 0x00007ffff63c0556 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007ffff63b404d in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007ffff63c0f48 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007ffff79bf064 in start_thread (arg=0x7fffeffb3700) at pthread_create.c:309
#5 0x00007ffff6cce62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Thread 1 (Thread 0x7ffff7fc0740 (LWP 7136)):
#0 main () at cuda_heap.cu:66
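As you can verify, every thread created at startup appears in the backtrace with a matching thread address and LWP (lightweight process) ID. You can also see that they all come from libcuda.so.1.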
In cuda-gdb you can see somewhat more detailed information:
(cuda-gdb) thread apply all bt
Thread 4 (Thread 0x7fffeef30700 (LWP 10019)):
#0 0x00007ffff79c33f8 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007ffff63c19b7 in cudbgApiDetach () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007ffff6386bb7 in cudbgApiDetach () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007ffff63c0f48 in cudbgApiDetach () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007ffff79bf064 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5 0x00007ffff6cce62d in clone () from /lib/x86_64-linux-gnu/libc.so.6
Thread 3 (Thread 0x7fffef731700 (LWP 10018)):
#0 0x00007ffff6cc5aed in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff63bf6a3 in cudbgApiDetach () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007ffff642261e in cuVDPAUCtxCreate () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007ffff63c0f48 in cudbgApiDetach () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007ffff79bf064 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5 0x00007ffff6cce62d in clone () from /lib/x86_64-linux-gnu/libc.so.6
Thread 2 (Thread 0x7fffeffb3700 (LWP 10017)):
#0 0x00007ffff6ccfa9f in accept4 () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff63c0556 in cudbgApiDetach () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007ffff63b404d in cudbgApiDetach () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007ffff63c0f48 in cudbgApiDetach () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007ffff79bf064 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5 0x00007ffff6cce62d in clone () from /lib/x86_64-linux-gnu/libc.so.6
Thread 1 (Thread 0x7ffff7fc0740 (LWP 10007)):
#0 main () at cuda_heap.cu:66
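I don't know exactly what these threads are for, but I think cuda-gdb needs to create multiple threads to catch errors/exceptions, for example memory violations or library conflicts.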