Using gdb backtrace to debug MPI code

Running the program under gdb and asking for a backtrace gives the following output:

[Thread debugging using libthread_db enabled]
[New Thread 0x2aaaaffd3700 (LWP 32109)]
[Thread 0x2aaaaffd3700 (LWP 32109) exited]
Detaching after fork from child process 32110.
Detaching after fork from child process 32111.
Detaching after fork from child process 32112.
Detaching after fork from child process 32113.
Detaching after fork from child process 32114.
Detaching after fork from child process 32115.
Detaching after fork from child process 32116.
Detaching after fork from child process 32117.
Detaching after fork from child process 32118.
Detaching after fork from child process 32119.
Detaching after fork from child process 32120.
Detaching after fork from child process 32121.
Detaching after fork from child process 32122.
Detaching after fork from child process 32123.
Detaching after fork from child process 32124.
Detaching after fork from child process 32125.
Detaching after fork from child process 32126.
Detaching after fork from child process 32127.
Detaching after fork from child process 32128.
Detaching after fork from child process 32129.
Detaching after fork from child process 32130.
Missing separate debuginfos, use: debuginfo-install     fftw-3.2.1-3.1.el6.x86_64 glibc-2.12-1.80.el6_3.5.x86_64 nss-pam-ldapd-0.7.5-14.el6_2.1.x86_64
Detaching after fork from child process 32131.
Detaching after fork from child process 32133.
Detaching after fork from child process 32134.
Detaching after fork from child process 32135.
Detaching after fork from child process 32136.
Detaching after fork from child process 32137.
Detaching after fork from child process 32138.
Detaching after fork from child process 32139.
Detaching after fork from child process 32140.
Detaching after fork from child process 32141.
Detaching after fork from child process 32142.
Detaching after fork from child process 32143.
Detaching after fork from child process 32144.

Program received signal SIGFPE, Arithmetic exception.

0x00000000004a3104 in phase::Mobility::Average ()
#0  0x00000000004a3104 in phase::Mobility::Average ()
#1  0x00000000004a3523 in phase::Mobility::Average(phase::Field&, phase::BoundaryConditions&) ()
#2  0x000000000046fcda in phase::Diffusion::CalculateMobility(phase::Field&, phase::Composition&, phase::BoundaryConditions&, phase::Mobility&) ()
#3  0x0000000000441a3e in MyParallelism<MyParallelBlock>::Run() ()
#4  0x00000000004436dc in main ()

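What does the order of the functions in the output indicate? Should I be looking at the last function in the output? How can I further narrow down the line that causes the arithmetic exception?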

EDIT: With the -g option, I get:

Program received signal SIGFPE, Arithmetic exception.
0x00000000004a5fa4 in phase::Mobility::Average ()
#0  0x00000000004a5fa4 in phase::Mobility::Average ()
#1  0x00000000004a63c3 in phase::Mobility::Average(phase::Field&, phase::BoundaryConditions&) ()
#2  0x0000000000472fea in phase::Diffusion::Mobility(phase::Field&, phase::Composition&, phase::BoundaryConditions&, phase::Mobility&) ()
#3  0x000000000042686e in MyParallelBlock::DoTimestep (this=0x7c9368)
    at Parallelism.cpp:100
#4  0x00000000004450d9 in MyParallelism<MyParallelBlock>::Run (
    this=0x7fffffffd2f0) at Parallelism.cpp:164
#5  0x0000000000446ad3 in main (argc=1, argv=0x7fffffffdcd8)
    at Parallelism.cpp:242

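But this does not narrow down the cause of the arithmetic exception. It only adds the information that the exception occurs inside the run loop, which I already knew. I expected more information about the function phase::Mobility::Average (). What do the numbers 0x0000000000446ad3, 0x00000000004450d9 etc. mean? Can I extract any information from them?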

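The gdb backtrace lists the functions on the call stack from the innermost frame to the outermost one: #0 is the function that was executing when the program stopped, and each following frame is the caller of the frame above it, ending at main.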

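If gdb catches an arithmetic exception or a segmentation fault, the function in which the error occurred is shown at position #0 of the backtrace.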
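The numbering can be illustrated with a toy model. The sketch below is not how gdb works internally; it only mimics the ordering. The names `Frame`, `g_frames`, `backtrace_like` and the four call-chain functions are hypothetical and chosen to mirror the shape of the trace in the question.

```cpp
#include <string>
#include <vector>

// Toy call stack: each function pushes its name on entry and pops it on
// exit (RAII), similar to how real stack frames come and go.
// Hypothetical helper, for illustration only.
static std::vector<std::string> g_frames;

struct Frame {
    explicit Frame(const std::string& name) { g_frames.push_back(name); }
    ~Frame() { g_frames.pop_back(); }
};

// List the active frames the way a gdb backtrace numbers them:
// #0 is the innermost (most recently called) function, the last entry
// is the outermost caller.
std::vector<std::string> backtrace_like() {
    std::vector<std::string> out;
    int n = 0;
    for (auto it = g_frames.rbegin(); it != g_frames.rend(); ++it, ++n) {
        out.push_back("#" + std::to_string(n) + "  " + *it);
    }
    return out;
}

// Hypothetical call chain: main -> Run -> CalculateMobility -> Average.
std::vector<std::string> average() {
    Frame f("Average");
    return backtrace_like();  // snapshot taken in the innermost function
}
std::vector<std::string> calculate_mobility() {
    Frame f("CalculateMobility");
    return average();
}
std::vector<std::string> run() {
    Frame f("Run");
    return calculate_mobility();
}
std::vector<std::string> entry_point() {
    Frame f("main");
    return run();
}
```

Calling `entry_point()` yields four lines with `Average` as `#0` and `main` as `#3`, the same shape as the first trace in the question. In a real gdb trace, the hexadecimal value at the start of each frame line is that frame's program counter: the instruction at which the program stopped for #0, and the return address in the caller for the other frames. Once the relevant file carries debug info, gdb's `info line *0x4a5fa4` or binutils' `addr2line -e <binary> 0x4a5fa4` (with `<binary>` standing in for your executable) can map such an address to a source line.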

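To get file and line information for the location of the error, recompile the program with debugging symbols. Use the compiler's -g flag to do this. Make sure you recompile at least the files that declare and implement the failing function (see #0 in the backtrace).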

In your case, you have to recompile the files that implement the class/namespace phase::Mobility with the -g option.
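Once the line is known, it helps to know what to look for. On x86 Linux, SIGFPE is typically raised by an integer division (or modulo) by zero, whereas floating-point division by zero does not trap by default. The body of phase::Mobility::Average is not shown in the question, so the following is only a minimal sketch of the pattern, with a hypothetical function `checked_average` and an assumed integer divisor:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an averaging routine; NOT the real
// phase::Mobility::Average, whose source is not shown in the question.
// Integer division by zero is undefined behavior in C++ and usually
// surfaces as SIGFPE on x86 Linux, so the divisor is checked first.
// Returns false (leaving `result` untouched) when there is nothing to average.
bool checked_average(const std::vector<long>& values, long& result) {
    const std::size_t count = values.size();
    if (count == 0) {
        return false;  // the unchecked version would compute sum / 0 here
    }
    long sum = 0;
    for (long v : values) {
        sum += v;
    }
    result = sum / static_cast<long>(count);
    return true;
}
```

If the divisor in the real function is an integer count, such as a number of cells or neighbours, printing it in gdb at frame #0 is a reasonable first check.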
