
Using gdb backtrace to debug MPI code

Running the program under gdb and issuing backtrace gives the following output:

[Thread debugging using libthread_db enabled]
[New Thread 0x2aaaaffd3700 (LWP 32109)]
[Thread 0x2aaaaffd3700 (LWP 32109) exited]
Detaching after fork from child process 32110.
Detaching after fork from child process 32111.
Detaching after fork from child process 32112.
Detaching after fork from child process 32113.
Detaching after fork from child process 32114.
Detaching after fork from child process 32115.
Detaching after fork from child process 32116.
Detaching after fork from child process 32117.
Detaching after fork from child process 32118.
Detaching after fork from child process 32119.
Detaching after fork from child process 32120.
Detaching after fork from child process 32121.
Detaching after fork from child process 32122.
Detaching after fork from child process 32123.
Detaching after fork from child process 32124.
Detaching after fork from child process 32125.
Detaching after fork from child process 32126.
Detaching after fork from child process 32127.
Detaching after fork from child process 32128.
Detaching after fork from child process 32129.
Detaching after fork from child process 32130.
Missing separate debuginfos, use: debuginfo-install     fftw-3.2.1-3.1.el6.x86_64 glibc-2.12-1.80.el6_3.5.x86_64 nss-pam-ldapd-0.7.5-14.el6_2.1.x86_64
Detaching after fork from child process 32131.
Detaching after fork from child process 32133.
Detaching after fork from child process 32134.
Detaching after fork from child process 32135.
Detaching after fork from child process 32136.
Detaching after fork from child process 32137.
Detaching after fork from child process 32138.
Detaching after fork from child process 32139.
Detaching after fork from child process 32140.
Detaching after fork from child process 32141.
Detaching after fork from child process 32142.
Detaching after fork from child process 32143.
Detaching after fork from child process 32144.

Program received signal SIGFPE, Arithmetic exception.

0x00000000004a3104 in phase::Mobility::Average ()
#0  0x00000000004a3104 in phase::Mobility::Average ()
#1  0x00000000004a3523 in phase::Mobility::Average(phase::Field&, phase::BoundaryConditions&) ()
#2  0x000000000046fcda in phase::Diffusion::CalculateMobility(phase::Field&, phase::Composition&, phase::BoundaryConditions&, phase::Mobility&) ()
#3  0x0000000000441a3e in MyParallelism<MyParallelBlock>::Run() ()
#4  0x00000000004436dc in main ()

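What does the order of the functions in the output indicate? Should I be looking at the last function in the output? How can I narrow this down further to the line that causes the arithmetic exception?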

Edit: with the -g option, I get:

Program received signal SIGFPE, Arithmetic exception.
0x00000000004a5fa4 in phase::Mobility::Average ()
#0  0x00000000004a5fa4 in phase::Mobility::Average ()
#1  0x00000000004a63c3 in phase::Mobility::Average(phase::Field&, phase::BoundaryConditions&) ()
#2  0x0000000000472fea in phase::Diffusion::Mobility(phase::Field&, phase::Composition&, phase::BoundaryConditions&, phase::Mobility&) ()
#3  0x000000000042686e in MyParallelBlock::DoTimestep (this=0x7c9368)
    at Parallelism.cpp:100
#4  0x00000000004450d9 in MyParallelism<MyParallelBlock>::Run (
    this=0x7fffffffd2f0) at Parallelism.cpp:164
#5  0x0000000000446ad3 in main (argc=1, argv=0x7fffffffdcd8)
    at Parallelism.cpp:242

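But this does not narrow down the cause of the arithmetic exception. It only adds the information that the exception occurs inside the run loop, which I already knew. I expected more information about the function phase::Mobility::Average (). What do the numbers 0x0000000000446ad3, 0x00000000004450d9, etc. mean? Can I get any information out of these numbers?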

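The gdb stack trace lists the functions on the call stack from top to bottom: frame #0 is the innermost function (the one that was executing when the program stopped), each following frame is its caller, and the last frame is main. The frames themselves were pushed onto the stack in the opposite order, main first.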

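If gdb catches an arithmetic exception or a segmentation fault, the function in which the error occurred is shown at position #0 of the gdb stack trace.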

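To get file and line information for the location of the error, recompile your program with debug symbols, using the compiler's -g flag. Make sure to recompile at least those files in which the failing function (see #0 in the stack trace) is declared and implemented.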

In your case, you have to recompile the file(s) implementing the class/namespace phase::Mobility with the -g option.

