How to track all descendant processes in Linux

Question

I am writing a library that needs to spawn multiple processes.

I want to know the set of all descendant processes that were spawned during a test. This is useful for terminating well-behaved daemons at the end of a passing test, or for debugging deadlocks and hanging processes by getting the stack trace of any process still present after a failing test.

Since some of this requires spawning daemons (fork, fork, then let the parent die), we cannot find all processes by iterating over the process tree: once the intermediate parent exits, the daemon is reparented and no longer appears under the test process.

Currently my approach is:

Register a handler using os.register_at_fork.
On fork, in the child, flock a file and append (pid, process start time) to another file.
When required, get the set of child processes by iterating over the entries in the file and keeping those whose (pid, process start time) matches an existing process.
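For readers who want something concrete, the three steps above could be sketched roughly as follows. This is a minimal sketch, not the code from my library: the file locations, the helper names (start_time, install, live_children) and the choice of reading the start time from field 22 of /proc/<pid>/stat are all assumptions made for illustration.

```python
import fcntl
import os
import tempfile

# Hypothetical file locations; the post does not say where these files live.
_DIR = tempfile.mkdtemp(prefix="proc-track-")
LOCK_PATH = os.path.join(_DIR, "lock")
LOG_PATH = os.path.join(_DIR, "pids")


def start_time(pid):
    """Return the start time (clock ticks since boot) of pid, or None if it is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            stat = f.read()
    except OSError:
        return None
    # The command name (field 2) may contain spaces or ')', so split after the
    # last ')'. starttime is field 22, which is index 19 of the remainder.
    return int(stat.rsplit(")", 1)[1].split()[19])


def _record_child():
    """Runs in the child right after fork: append 'pid starttime' under a lock."""
    pid = os.getpid()
    with open(LOCK_PATH, "a") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)
        with open(LOG_PATH, "a") as log:
            log.write(f"{pid} {start_time(pid)}\n")
        # Closing the lock file releases the flock.


def install():
    """Register the hook; it only fires for os.fork (and multiprocessing's fork)."""
    os.register_at_fork(after_in_child=_record_child)


def live_children():
    """Return the recorded (pid, start time) pairs that still match a process."""
    if not os.path.exists(LOG_PATH):
        return set()
    with open(LOCK_PATH, "a") as lock:
        fcntl.flock(lock, fcntl.LOCK_SH)
        with open(LOG_PATH) as log:
            entries = {tuple(map(int, line.split())) for line in log if line.strip()}
    # A reused PID has a different start time, so stale entries are filtered out.
    return {(pid, st) for pid, st in entries if start_time(pid) == st}
```

Note that a child which has exited but has not yet been waited on (a zombie) still has a /proc entry, so it continues to match until it is reaped.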

The downsides of this approach are:

It only works with multiprocessing or os.fork. It does not work when spawning a new Python process using subprocess, or when spawning a non-Python process.
Locking around the fork may make things more deterministic during tests than they will be in reality, hiding race conditions.

I am looking for a different way to track child processes that avoids these two downsides.

Alternatives I have considered:

Using bcc to register probes on fork/clone. The problem is that it requires root, which I think would be annoying for contributors running the tests. Is there something similar that can be done as an unprivileged user, just for the current process and its descendants?
Using strace (or ptrace), similar to the above. The problem is the performance impact. Several of the tests specifically benchmark startup time, and ptrace has a relatively large overhead. The overhead might be smaller if only fork and clone were traced, but it still conflicts with the desire to get the stacks on test timeout.

Can someone suggest an approach to this problem that avoids the pitfalls and downsides of the ones above? I am only interested in Linux right now, and ideally it should not require a kernel later than 4.15.

Answer 1

For subprocess.Popen, there is a preexec_fn argument that takes a callable, which runs in the child process just before the new program is executed. You can hack your way through it.
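As a rough illustration of the preexec_fn idea, the sketch below records the PID of each process started through Popen by creating an empty file named after the PID. The directory and the record_pid helper are hypothetical names chosen for this example, and the launched command is just a placeholder.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical directory in which each child leaves a file named after its PID.
pid_dir = tempfile.mkdtemp(prefix="pids-")


def record_pid():
    # Runs in the child, after fork and before exec, so getpid() is the child's PID.
    open(os.path.join(pid_dir, str(os.getpid())), "w").close()


# Placeholder command: a Python interpreter that exits immediately.
proc = subprocess.Popen([sys.executable, "-c", "pass"], preexec_fn=record_pid)
proc.wait()

recorded = {int(name) for name in os.listdir(pid_dir)}
print(proc.pid in recorded)
```

This only covers processes launched directly through Popen calls that pass the callable. Anything the child spawns on its own is not recorded, and the subprocess documentation warns that preexec_fn is not safe to use in the presence of threads.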

Alternatively, take a look at cgroups (control groups). I believe they can handle tricky situations such as daemon creation.
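To give an idea of what the cgroup route looks like from Python, here is a small sketch that only covers the read side: finding the cgroup v2 directory of the current process and listing the PIDs in a cgroup. The function names are made up for this example, and it assumes cgroup v2 mounted at /sys/fs/cgroup. Creating a dedicated cgroup for the test run and moving the test process into it requires write access to the hierarchy (for example through delegation), which is not shown here.

```python
import os


def own_cgroup_v2_dir(mount="/sys/fs/cgroup"):
    """Return the cgroup v2 directory of this process, or None if not found."""
    try:
        with open("/proc/self/cgroup") as f:
            for line in f:
                hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
                # The cgroup v2 entry has hierarchy ID 0 and no controller list.
                if hierarchy_id == "0" and controllers == "":
                    return os.path.join(mount, path.lstrip("/"))
    except OSError:
        pass
    return None


def pids_in_cgroup(cgroup_dir):
    """Return the PIDs listed in cgroup.procs of the given cgroup directory."""
    with open(os.path.join(cgroup_dir, "cgroup.procs")) as f:
        return {int(line) for line in f if line.strip()}
```

Because children inherit the cgroup of their parent across fork and keep it when they are reparented, a daemon started inside a dedicated cgroup stays listed there. cgroup.procs lists only the processes directly in that cgroup, not those in nested cgroups.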

Answer 2

Given the constraints from my original post, I used the following approach:

putenv("PID_DIR", <some tempdir>)
For the current process, override fork and clone with versions that record the process start time in $PID_DIR/<pid>. The override is done using plthook and applies to all loaded shared objects. dlopen should also be overridden so that the functions are overridden in any libraries loaded dynamically later.
Set a library with implementations of __libc_start_main, fork, and clone as LD_PRELOAD.
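The Python side of these steps might look something like the sketch below. It is not the actual process_tracker code: the function names, the optional preload_lib path and the assumption that each $PID_DIR/<pid> file contains just the start time as an integer are all illustrative. The native hooks themselves (plthook, the LD_PRELOAD library) are not shown.

```python
import os
import tempfile


def proc_start_time(pid):
    """Return field 22 (starttime) of /proc/<pid>/stat, or None if pid is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            # Split after the last ')' because the command name may contain spaces.
            return int(f.read().rsplit(")", 1)[1].split()[19])
    except OSError:
        return None


def setup_tracking(preload_lib=None):
    """Create the PID directory and export it so every descendant inherits it."""
    pid_dir = tempfile.mkdtemp(prefix="pid-dir-")
    os.environ["PID_DIR"] = pid_dir
    if preload_lib is not None:
        # Hypothetical path to the shared library that interposes fork and clone.
        os.environ["LD_PRELOAD"] = preload_lib
    return pid_dir


def live_descendants(pid_dir):
    """Return the PIDs whose recorded start time still matches a running process."""
    live = set()
    for name in os.listdir(pid_dir):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(pid_dir, name)) as f:
                recorded = int(f.read().strip())
        except (OSError, ValueError):
            continue  # entry vanished or is only partially written
        if proc_start_time(int(name)) == recorded:
            live.add(int(name))
    return live
```

Using one file per PID means writers never contend for a shared file, so no lock around fork is needed.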

An initial implementation is available here, used like:

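# Install the fork/clone hooks before any process is spawned.
import process_tracker; process_tracker.install()

import os

# Three consecutive forks produce eight processes in total.
pid1 = os.fork()
pid2 = os.fork()
pid3 = os.fork()

# Only the original process sees a non-zero PID from all three forks.
if pid1 and pid2 and pid3:
    print(process_tracker.children())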

2 answers

Answer 1
0 2019-05-07 22:25:44

Answer 2
0 accepted 2019-05-24 04:55:06
