How to monitor processes on Linux
When an executable runs on Linux, it creates processes and threads, performs I/O, and so on, and it uses libraries written in languages like C/C++. Sometimes timers are involved as well. Is it possible to monitor all of this? How can I get a deep dive into this software and these processes, and see what is going on in the background?
I know this stuff is abstracted away from me because I shouldn't need to worry about it as a regular user, but I'm curious about what I would see.
What I need to see:
The different things you want to monitor may require different tools. All the tools I mention below have extensive manual pages where you can find exactly how to use them.
System calls for this process/thread.
The strace command does exactly this: it lists which system calls are invoked by your program. The ltrace tool is similar, but focuses on calls to library functions, not just system calls (which involve the kernel).
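As a rough illustration of how strace is typically invoked, here is a minimal Python sketch that only builds the command line. The helper name `strace_command` is hypothetical; the flags it uses (`-f` to follow threads and child processes, `-e trace=` to restrict the traced calls, `-p` to attach to a running PID) are standard strace options described in its manual page.

```python
def strace_command(target, syscalls=None, follow_children=True):
    """Build an strace argument list.

    target is either a PID (int) to attach to, or a command given as a
    list of strings to start under strace. Hypothetical helper for
    illustration only.
    """
    argv = ["strace"]
    if follow_children:
        argv.append("-f")  # also trace threads and child processes
    if syscalls:
        # only report the named system calls, e.g. openat,read
        argv += ["-e", "trace=" + ",".join(syscalls)]
    if isinstance(target, int):
        argv += ["-p", str(target)]  # attach to an already running process
    else:
        argv += list(target)  # launch the command under strace
    return argv


if __name__ == "__main__":
    print(" ".join(strace_command(["ls", "/"], syscalls=["openat", "read"])))
    print(" ".join(strace_command(1234)))
```

The resulting list can be passed to `subprocess.run` if strace is installed. Attaching to a process you did not start usually requires suitable permissions.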
Open/closed sockets.
The strace/ltrace commands will list socket creation, among other things. But if you want to know which sockets are open right now (connected, listening, and so on), there is the netstat utility, which lists all the connected sockets in the system (with "-a", the listening ones too) and, with "-p", which process each one belongs to.
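To show where this kind of information comes from, here is a small Python sketch that lists the IPv4 TCP sockets of one process by reading the /proc filesystem directly. It is an illustration under the assumption that /proc is mounted and readable, not a replacement for netstat. The function names are made up for this example, and only two state codes (01 and 0A) are decoded.

```python
import os
import re
import socket


def socket_inodes(pid="self"):
    """Return the inode numbers of all sockets a process holds open."""
    inodes = set()
    fd_dir = f"/proc/{pid}/fd"
    for fd in os.listdir(fd_dir):
        try:
            link = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            continue  # descriptor was closed while we were scanning
        match = re.fullmatch(r"socket:\[(\d+)\]", link)
        if match:
            inodes.add(int(match.group(1)))
    return inodes


def tcp_sockets(pid="self"):
    """Yield (local_port, state) for the IPv4 TCP sockets of a process."""
    states = {"01": "ESTABLISHED", "0A": "LISTEN"}  # other codes stay as hex
    wanted = socket_inodes(pid)
    with open(f"/proc/{pid}/net/tcp") as table:
        next(table)  # skip the header line
        for line in table:
            fields = line.split()
            local, state, inode = fields[1], fields[3], int(fields[9])
            if inode in wanted:
                port = int(local.rsplit(":", 1)[1], 16)  # port is in hex
                yield port, states.get(state, state)


if __name__ == "__main__":
    # Open a listening socket so there is something to find.
    server = socket.socket()
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    for port, state in tcp_sockets():
        print(port, state)
    server.close()
```

IPv6 and UDP sockets live in sibling files such as /proc/net/tcp6 and /proc/net/udp, which this sketch does not read.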
Memory management and utilization, what block is being accessed.
Memory instructions.
Again, ltrace will let you see all malloc()/free() calls, but to see exactly what memory is being accessed, and where, you'll need a debugger like gdb. The thing is that almost everything your program does is a "memory instruction", so you usually don't want to see every memory access in your program. You'll need to know exactly what you are looking for and narrow it down with breakpoints, tracepoints, single-stepping, and so on.
If you don't want to find all memory accesses but are rather searching for bugs in this area, like accessing memory after it's freed, there are tools that help you find those more easily. One of them, ASAN ("Address Sanitizer"), is built into the C/C++ compilers (GCC and Clang enable it with -fsanitize=address), so you can build with it enabled and get messages on bad access patterns. Another one you can use is valgrind.
Finally, if by "memory utilization" you just meant checking how much memory your process or thread is using, both ps and top can tell you that.
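For a quick look without ps or top, the same kind of figures can be read from /proc/<pid>/status on Linux. The sketch below is only an illustration: the function name `memory_usage_kib` is made up, and it assumes the usual "VmRSS: <number> kB" line format of that file.

```python
def memory_usage_kib(pid="self"):
    """Return selected memory figures of a process, in KiB.

    VmRSS is the resident set size and VmSize the virtual size;
    VmHWM and VmPeak are their respective peaks.
    """
    wanted = ("VmSize", "VmRSS", "VmPeak", "VmHWM")
    usage = {}
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            key, _, value = line.partition(":")
            if key in wanted:
                usage[key] = int(value.split()[0])  # value looks like "1234 kB"
    return usage


if __name__ == "__main__":
    for name, kib in memory_usage_kib().items():
        print(f"{name}: {kib} KiB")
```

Pass a numeric PID instead of the default "self" to inspect another process you are allowed to read.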
If a process is depending on the results of another one.
If a process/thread terminates, why, and was it successful?
Tools I mentioned, like strace/ltrace, will tell you when the process they follow exits. Any process can print the exit code of one of its sub-processes, but I'm not aware of a tool that can print the exit status of all processes in the system.
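To make the sub-process case concrete, here is a minimal Python sketch of a parent inspecting how its child ended. The helper `describe_exit` is a hypothetical name. It relies on the documented behaviour of Python's `subprocess` module on POSIX, where a negative `returncode` means the child was terminated by that signal number.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable reason."""
    if returncode == 0:
        return "exited successfully"
    if returncode > 0:
        return f"exited with error code {returncode}"
    # On POSIX, -N means the child was terminated by signal N.
    return f"killed by signal {signal.Signals(-returncode).name}"


if __name__ == "__main__":
    # A child that exits with status 3.
    failing = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])
    print(describe_exit(failing.returncode))

    # A child that kills itself with SIGKILL.
    killed = subprocess.run(
        [sys.executable, "-c",
         "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
    )
    print(describe_exit(killed.returncode))
```

This only works for children of the process doing the waiting, which matches the limitation described above.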
I/O operations
iostat can give you periodic summaries of how much I/O was done to each disk. netstat -s gives you network statistics, so you can see how many network operations were done. vmstat gives you, among other things, statistics on I/O caused by swapping in and out (in case that is a problem for you).
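The tools above report system-wide numbers. As a complementary per-process view, which is my own addition rather than something those tools print, Linux kernels built with task I/O accounting expose counters in /proc/<pid>/io. The sketch below reads them. The function name `io_counters` is hypothetical, and it returns an empty dict when the file is missing or unreadable.

```python
def io_counters(pid="self"):
    """Return the I/O counters of a process from /proc/<pid>/io.

    Typical keys include rchar and wchar (bytes passed through read/write
    style system calls) and read_bytes and write_bytes (bytes that actually
    hit the storage layer). Returns {} if the file is not available.
    """
    counters = {}
    try:
        with open(f"/proc/{pid}/io") as stats:
            for line in stats:
                key, _, value = line.partition(":")
                counters[key.strip()] = int(value)
    except OSError:
        pass  # no I/O accounting, or no permission to read this process
    return counters


if __name__ == "__main__":
    for name, count in io_counters().items():
        print(f"{name}: {count}")
```

Sampling these counters twice and subtracting gives a rough I/O rate for a single process over that interval.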
and DB read/write if any.
This depends on your DB, I guess, and how you monitor it.