Valgrind+gdb debugging with MPI, error in library?

I am having a problem debugging with gdb+valgrind. I run valgrind with the vgdb option and then, in another session, start gdb and connect with the target remote command. However, errors appear right at the beginning, during MPI initialization. I get warnings of this type:

warning: cannot close "/usr/lib64/openmpi/lib/openmpi/mca_btl_ofud.so": Invalid operation
warning: cannot close "/lib64/libosmcomp.so.3": Invalid operation
warning: cannot close "/lib64/librdmacm.so.1": Invalid operation
warning: cannot close "/lib64/libibverbs.so.1": Invalid operation
warning: cannot close "/lib64/libibumad.so.3": Invalid operation
warning: cannot close "/usr/lib64/openmpi/lib/openmpi/mca_btl_openib.so": Invalid operation
warning: cannot close "/usr/lib64/openmpi/lib/openmpi/mca_pml_bfo.so": Invalid operation
warning: cannot close "/usr/lib64/openmpi/lib/openmpi/mca_pml_csum.so": Invalid operation
warning: cannot close "/usr/lib64/openmpi/lib/openmpi/mca_pml_v.so": Invalid operation

Then I get this error:

Program received signal SIGTRAP, Trace/breakpoint trap.
0x0000000007950277 in __libc_writev (fd=7, vector=0x9a40f90, count=3) at ../sysdeps/unix/sysv/linux/writev.c:50
50          result = INLINE_SYSCALL (writev, 3, fd, CHECK_N (vector, count), count);

The problem is that after I enter continue, gdb prints "Continuing.", but the program no longer seems to be executing. Before I got these errors in the MPI library (PMPI_Init (in /usr/lib64/openmpi/lib/libmpi.so.1.0.6)), which are reported by valgrind, I couldn't inspect the error with gdb; I would constantly get:

Cannot access memory at address 0x39 
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.5.8-1.fc18.x86_64 krb5-libs-1.10.3-17.fc18.x86_64 libcom_err-1.42.5-1.fc18.x86_64 libesmtp-1.0.6-4.fc18.x86_64 libselinux-2.1.12-7.3.fc18.x86_64 openssl-libs-1.0.1e-37.fc18.x86_64 pcre-8.31-5.fc18.x86_64

It seems that there is an error in the MPI library, but since I am not a proficient gdb user, I am not 100% sure. Are there any suggestions as to what might be wrong?
Thanks in advance!

First of all, why are you trying to use gdb and valgrind together? Find your bug using gdb, then find your memory leaks using valgrind after you've fixed your bugs.

Regarding GDB and signals: GDB will catch all signals before they get to your application.

So if your application should not be receiving signals, you need to figure out why it is receiving one.

However, you can ask gdb not to stop when a given signal arrives (SIGUSR1 in this example), like so:

gdb -p $pid -x $file

$ cat $file
handle SIGUSR1 nostop
continue
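
The same idea can be written as a small script. This is only a sketch under some assumptions: the command file name `signals.gdb` and the `TARGET_PID` variable are made up for illustration, and the `noprint pass` keywords are added so that gdb stays quiet about the signal while still delivering it to the program.

```shell
#!/bin/sh
# Sketch: generate a gdb command file and, optionally, attach to a process.
# "signals.gdb" and TARGET_PID are hypothetical names chosen for this example.
cmdfile="signals.gdb"

cat > "$cmdfile" <<'EOF'
# Do not stop or print a message on SIGUSR1, but still deliver it to the program.
handle SIGUSR1 nostop noprint pass
# Resume the attached process.
continue
EOF

echo "wrote $cmdfile"

# Attach only if the caller supplied a process ID.
if [ -n "${TARGET_PID:-}" ]; then
    gdb -p "$TARGET_PID" -x "$cmdfile"
fi
```

Run it as `TARGET_PID=12345 sh attach.sh` (the script name is also hypothetical). gdb also accepts `handle all nostop noprint` to cover every signal except the ones the debugger uses itself, typically SIGTRAP and SIGINT. That means a SIGTRAP stop like the one in the question would presumably not be silenced this way.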
