
How to monitor processes on Linux

When an executable runs on Linux, it creates processes and threads, performs I/O, and uses libraries written in languages like C/C++; sometimes timers are involved as well. Is it possible to monitor all of this? How can I get a deep dive into this software and these processes, and see what is going on in the background?

I know this stuff is abstracted away from me because I shouldn't have to worry about it as a regular user, but I'm curious about what I would see.

What I need to see:

  1. System calls for this process/thread.
  2. Open/closed sockets.
  3. Memory management and utilization, and which block is being accessed.
  4. Memory instructions.
  5. Whether a process depends on the results of another one.
  6. If a process/thread terminates, why, and was it successful?
  7. I/O operations and DB reads/writes, if any.

How about top?

That page lists more than one Linux/Unix command that might come in handy.

The different things you want to monitor may require different tools. All the tools I mention below have extensive manual pages where you can find out exactly how to use them.

System calls for this process/thread.

The strace command does exactly this: it lists which system calls your program invokes. The ltrace tool is similar, but focuses on calls to library functions, not just system calls (which involve the kernel).
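
As a rough illustration of working with that output, here is a minimal Python sketch that counts system calls in text saved with `strace -o trace.txt ./program`. The regular expression, the `count_syscalls` name, and the sample lines are my own assumptions about typical strace output, not something strace itself provides; `strace -c` already prints a similar summary natively.

```python
import re
from collections import Counter

# Matches lines such as: openat(AT_FDCWD, "/etc/hostname", O_RDONLY) = 3
# An optional "[pid 1234] " or "1234 " prefix is allowed, as seen with strace -f.
SYSCALL_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\(.*\)\s+=\s+(-?\d+|\?|0x[0-9a-f]+)"
)


def count_syscalls(lines):
    """Count how often each system call name appears in strace output."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample in the usual strace layout; real traces are much longer.
SAMPLE = [
    'openat(AT_FDCWD, "/etc/hostname", O_RDONLY) = 3',
    'read(3, "myhost\\n", 4096) = 7',
    'read(3, "", 4096) = 0',
    'openat(AT_FDCWD, "/missing", O_RDONLY) = -1 ENOENT (No such file or directory)',
    "close(3) = 0",
    "+++ exited with 0 +++",
]

if __name__ == "__main__":
    for name, count in count_syscalls(SAMPLE).most_common():
        print(f"{name}: {count}")
```

Lines that are not system calls, such as the final exit notice, are simply skipped.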

Open/closed sockets.

The strace/ltrace commands will show socket creation, among other things. But if you want to know which sockets are open right now (connected, listening, and so on), there is the netstat utility. It lists all the connected sockets in the system (with "-a", the listening ones too) and, with "-p", which process each one belongs to.
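
On Linux, the kernel exposes the IPv4 TCP socket table as text in /proc/net/tcp. The sketch below decodes one line of that table in Python; the function names and the sample line are hypothetical, the state table is deliberately partial, and the address decoding assumes a little-endian machine such as x86.

```python
import socket
import struct

# A few of the TCP state codes that appear in /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "0A": "LISTEN"}


def decode_ipv4_endpoint(hex_endpoint):
    """Turn '0100007F:1F90' into ('127.0.0.1', 8080).

    Assumes a little-endian host, where the address bytes are printed
    in reversed order.
    """
    hex_ip, hex_port = hex_endpoint.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def parse_tcp_line(line):
    """Return (local endpoint, remote endpoint, state) for one table row."""
    fields = line.split()
    local = decode_ipv4_endpoint(fields[1])
    remote = decode_ipv4_endpoint(fields[2])
    state = TCP_STATES.get(fields[3], fields[3])  # unknown codes stay as hex
    return local, remote, state


# Hypothetical row: a socket listening on 127.0.0.1:8080.
SAMPLE_LINE = (
    "   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000 "
    "00:00000000 00000000  1000        0 12345 1 0000000000000000 100 0 0 10 0"
)

if __name__ == "__main__":
    print(parse_tcp_line(SAMPLE_LINE))
```

To read the live table, skip the header line of /proc/net/tcp and pass each remaining line to `parse_tcp_line`.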

Memory management and utilization, what block is being accessed. Memory instructions.

Again, ltrace will let you see all malloc()/free() calls, but to see exactly which memory is being accessed and where, you'll need a debugger such as gdb. The thing is that almost everything your program does is a "memory instruction", so you usually don't want to see every memory access in your program. You'll need to know exactly what you are looking for and narrow it down with breakpoints, tracepoints, single-stepping, and so on.

If you don't want to find all memory accesses but are instead hunting for bugs in this area, such as accessing memory after it has been freed, there are tools that help you find those more easily. One of them, ASAN ("Address Sanitizer"), is built into the GCC and Clang compilers, so you can build with it enabled (-fsanitize=address) and get messages about bad access patterns. Another one you can use is valgrind.

Finally, if by "memory utilization" you just meant checking how much memory your process or thread is using, both ps and top can tell you that.
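
The memory figures that ps and top report can also be read from /proc/<pid>/status on Linux. Here is a minimal Python sketch that extracts the resident set size (the VmRSS field); the helper names and the sample text are hypothetical, and the code assumes the usual "Key: value" layout of that file.

```python
def parse_status(text):
    """Parse the 'Key: value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value.strip()
    return fields


def resident_kib(status_fields):
    """Return VmRSS in kB, or None if the field is absent (kernel threads)."""
    value = status_fields.get("VmRSS")
    return int(value.split()[0]) if value else None


# Hypothetical excerpt in the layout the kernel uses.
SAMPLE_STATUS = "Name:\tbash\nVmSize:\t   12000 kB\nVmRSS:\t    3456 kB\nThreads:\t1\n"

if __name__ == "__main__":
    print(resident_kib(parse_status(SAMPLE_STATUS)))
    try:
        # On Linux, /proc/self refers to the process doing the reading.
        with open("/proc/self/status") as status_file:
            print(resident_kib(parse_status(status_file.read())))
    except OSError:
        print("no /proc/self/status on this system")
```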

If a process is depending on the results of another one. If a process/thread terminates, why, and was it successful?

Tools I mentioned above, like strace/ltrace, will tell you when the process they follow exits; strace, for instance, ends its output with a line such as "+++ exited with 0 +++". Any process can print the exit code of one of its sub-processes, but I'm not aware of a tool that can print the exit status of every process in the system.
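
To illustrate the point that a parent can report on its own children, here is a small Python sketch using the standard subprocess module. On POSIX systems a negative return code means the child was killed by that signal number; the `describe_exit` and `run_and_report` names are my own.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Explain a subprocess return code in words (POSIX conventions)."""
    if returncode == 0:
        return "exited successfully"
    if returncode > 0:
        return f"exited with error code {returncode}"
    # subprocess reports "killed by signal N" as the return code -N.
    return f"killed by signal {signal.Signals(-returncode).name}"


def run_and_report(argv):
    """Run a child process to completion and describe how it ended."""
    completed = subprocess.run(argv)
    return describe_exit(completed.returncode)


if __name__ == "__main__":
    # The child here is just another Python interpreter that exits with 3.
    print(run_and_report([sys.executable, "-c", "import sys; sys.exit(3)"]))
```

This only covers children the script starts itself; it does not observe unrelated processes.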

I/O operations

iostat can give you periodic summaries of how much I/O was done to each disk. netstat -s gives you network statistics, so you can see how many network operations were done. vmstat gives you, among other things, statistics on I/O caused by swapping in/out (in case that is a problem for you).
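
Those tools report system-wide figures. For a single process, Linux also keeps counters in /proc/<pid>/io, and taking two snapshots a few seconds apart gives a per-process analogue of a periodic summary. The Python sketch below shows that idea; the function names and sample snapshots are hypothetical, and reading another user's /proc/<pid>/io normally requires elevated privileges.

```python
def parse_proc_io(text):
    """Parse the 'name: number' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            counters[name.strip()] = int(value)
    return counters


def io_delta(before, after):
    """Return how much each counter grew between two snapshots."""
    return {name: after[name] - before.get(name, 0) for name in after}


# Hypothetical snapshots of the same process, taken a few seconds apart.
SNAPSHOT_1 = "rchar: 3000\nwchar: 1200\nread_bytes: 4096\nwrite_bytes: 8192\n"
SNAPSHOT_2 = "rchar: 5000\nwchar: 1200\nread_bytes: 12288\nwrite_bytes: 8192\n"

if __name__ == "__main__":
    delta = io_delta(parse_proc_io(SNAPSHOT_1), parse_proc_io(SNAPSHOT_2))
    for name, grew_by in delta.items():
        print(f"{name}: +{grew_by}")
```

In that file, read_bytes and write_bytes count traffic that actually reached the storage layer, while rchar and wchar also include reads and writes served from the page cache.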

and DB read/write if any.

I guess this depends on your DB and on how you monitor it.
