
Analyze memory dump of a dotnet core process running in Kubernetes Linux

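I am using Kubernetes in Google Cloud (GKE).

I have an application that is hoarding memory, and I need to take a process dump as indicated here. Kubernetes will kill the pod when it reaches 512Mb of RAM.

So I connect to the pod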

# kubectl exec -it stuff-7d8c5598ff-2kchk /bin/bash

And run:

# apt-get update && apt-get install procps && apt-get install gdb

Find the process I want:

root@stuff-7d8c5598ff-2kchk:/app# ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  4.6  2.8 5318004 440268 ?      SLsl Oct11 532:18 dotnet stuff.Web.dll
root      114576  0.0  0.0  18212  3192 ?        Ss   17:23   0:00 /bin/bash
root      114583  0.0  0.0  36640  2844 ?        R+   17:23   0:00 ps aux
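
As a rough sanity check on the listing above: the RSS column of ps is in KiB, so it can be converted to MiB and compared with the limit. This is only a sketch. The function names rss_kib_to_mib and headroom_mib are made up for illustration, "512Mb" is assumed to mean 512 MiB, and RSS is only an approximation of what the container's memory accounting counts.

```shell
#!/usr/bin/env bash
# Convert an RSS value from ps (KiB) to whole MiB, rounded down.
rss_kib_to_mib() {
  echo $(( $1 / 1024 ))
}

# Remaining whole MiB before a limit is reached.
# Args: <rss_kib> <limit_mib>
headroom_mib() {
  echo $(( $2 - $1 / 1024 ))
}

rss_kib_to_mib 440268      # RSS of PID 1 from the ps listing
headroom_mib 440268 512    # distance to the assumed 512 MiB limit
```

For the values shown, the process is at roughly 429 MiB, which leaves about 83 MiB before the pod is killed.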

But when I try to dump...

root@stuff-7d8c5598ff-2kchk:/app# gcore 1
ptrace: Operation not permitted.
You can't do that without a process to debug.
The program is not being run.
gcore: failed to create core.1

I tried several solutions along these lines, but they always end with the same result:

root@stuff-7d8c5598ff-2kchk:/app# echo 0 > /proc/sys/kernel/yama/ptrace_scope
bash: /proc/sys/kernel/yama/ptrace_scope: Read-only file system

I cannot find a way to connect to the pod and deal with this ptrace thing. I found that docker has a --privileged switch, but I cannot find anything similar for kubectl.
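
One way to check whether a missing capability is behind the ptrace error is to look at the effective capability mask of the shell inside the container. The sketch below assumes bash and a Linux /proc filesystem; the function name has_sys_ptrace is hypothetical. CAP_SYS_PTRACE is capability number 19 in the kernel headers, so the check tests bit 19 of the CapEff value.

```shell
#!/usr/bin/env bash
# CAP_SYS_PTRACE is capability number 19 in linux/capability.h.
CAP_SYS_PTRACE_BIT=19

# Return 0 if the given hex capability mask (as printed in the CapEff
# field of /proc/<pid>/status) includes CAP_SYS_PTRACE, 1 otherwise.
has_sys_ptrace() {
  local mask="$1"
  (( (0x$mask >> CAP_SYS_PTRACE_BIT) & 1 ))
}

# Inspect the current shell's effective capabilities, if /proc is available.
if [ -r /proc/self/status ]; then
  cap_eff=$(awk '/^CapEff:/ {print $2}' /proc/self/status)
  if [ -n "$cap_eff" ]; then
    if has_sys_ptrace "$cap_eff"; then
      echo "CAP_SYS_PTRACE present (CapEff=$cap_eff)"
    else
      echo "CAP_SYS_PTRACE missing (CapEff=$cap_eff)"
    fi
  fi
fi
```

If the capability is reported missing, gcore cannot attach even when running as root, which would be consistent with the "Operation not permitted" message above.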

UPDATE: I found how to enable PTRACE:

apiVersion: v1
kind: Pod
metadata:
  name: <your-pod>
spec:
  shareProcessNamespace: true
  containers:
  - name: containerB
    image: <your-debugger-image>
    securityContext:
      capabilities:
        add:
        - SYS_PTRACE
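
If the pod is managed by a Deployment, the capability has to go into the pod template rather than the running pod. A minimal sketch, assuming the Deployment and its container are both named stuff (guessed from the pod name, so adjust to your manifest) and using a hypothetical file name ptrace-patch.yaml. The kubectl line is left as a comment so the snippet only writes the file.

```shell
#!/usr/bin/env bash
# Write a strategic merge patch that adds SYS_PTRACE to one container
# of a Deployment. "stuff" is an assumed container name.
cat > ptrace-patch.yaml <<'EOF'
spec:
  template:
    spec:
      containers:
      - name: stuff
        securityContext:
          capabilities:
            add:
            - SYS_PTRACE
EOF

# Apply it (assumes a Deployment also named "stuff"):
# kubectl patch deployment stuff -p "$(cat ptrace-patch.yaml)"
```

Changing the pod template rolls out new pods, so the process to dump will be running in a pod with a different name afterwards.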

Get the process dump:

root@stuff-6cd8848797-klrwr:/app# gcore 1
[New LWP 9]
[New LWP 10]
[New LWP 13]
[New LWP 14]
[New LWP 15]
[New LWP 16]
[New LWP 17]
[New LWP 18]
[New LWP 19]
[New LWP 20]
[New LWP 22]
[New LWP 24]
[New LWP 25]
[New LWP 27]
[New LWP 74]
[New LWP 100]
[New LWP 753]
[New LWP 756]
[New LWP 765]
[New LWP 772]
[New LWP 814]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
185     ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S: No such file or directory.
warning: target file /proc/1/cmdline contained unexpected null characters
Saved corefile core.1

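Funny thing: I could not find lldb-3.6 (apt only matches python-lldb-3.6 for it, as the output below shows), so I installed lldb-3.8: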

root@stuff-6cd8848797-klrwr:/app# apt-get update && apt-get install lldb-3.6
Hit:1 http://security.debian.org/debian-security stretch/updates InRelease
Ign:2 http://cdn-fastly.deb.debian.org/debian stretch InRelease
Hit:3 http://cdn-fastly.deb.debian.org/debian stretch-updates InRelease
Hit:4 http://cdn-fastly.deb.debian.org/debian stretch Release
Reading package lists... Done
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'python-lldb-3.6' for regex 'lldb-3.6'
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

Find the SOS plugin:

root@stuff-6cd8848797-klrwr:/app# find /usr -name libsosplugin.so
/usr/share/dotnet/shared/Microsoft.NETCore.App/2.1.5/libsosplugin.so

Run lldb...

root@stuff-6cd8848797-klrwr:/app# lldb `which dotnet` -c core.1
(lldb) target create "/usr/bin/dotnet" --core "core.1"

But it hangs there forever; the (lldb) prompt never comes back...

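I had a similar problem. Try installing the correct version of LLDB. The SOS plugin shipped with a specific dotnet version is linked against a specific version of LLDB. For example, dotnet 2.0.5 is linked against LLDB 3.6, and 2.1.5 against LLDB 3.9. This document may also help: Debugging CoreCLR

Note that not every version of LLDB is available for every OS. For example, LLDB 3.6 is not available on Debian, but it is on Ubuntu.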
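
To make the version pairing concrete, here is a small sketch that derives the runtime version from the SOS plugin path found in the question and looks up the matching LLDB version. The function names are hypothetical, and the table contains only the two pairs mentioned above; any other version is reported as unknown rather than guessed.

```shell
#!/usr/bin/env bash
# Extract the runtime version from a libsosplugin.so path, e.g.
# .../Microsoft.NETCore.App/2.1.5/libsosplugin.so -> 2.1.5
runtime_version_from_sos() {
  basename "$(dirname "$1")"
}

# Map a runtime version to the LLDB version its SOS plugin links against.
# Only the two pairs named in this answer are listed.
lldb_version_for_runtime() {
  case "$1" in
    2.0.5) echo "3.6" ;;
    2.1.5) echo "3.9" ;;
    *)     echo "unknown" ;;
  esac
}

sos=/usr/share/dotnet/shared/Microsoft.NETCore.App/2.1.5/libsosplugin.so
ver=$(runtime_version_from_sos "$sos")
echo "runtime $ver -> LLDB $(lldb_version_for_runtime "$ver")"
```

For the plugin path in the question this points at LLDB 3.9 rather than the 3.8 that was installed, which may explain the hang.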

