
Spawn process from multithreaded application

Original post: 2011-11-16 12:58:59. Tags: c, linux, unix, process, fork

I have a situation where I need to spawn a helper process from within a very large, multithreaded application that I do not have complete control over.

Right now I'm using fork() / exec(). This works a lot of the time, but in some circumstances the child crashes weirdly before the exec() happens. I suspect this is because fork()ing multithreaded applications is generally considered to be a Really Bad Idea.

I would really, really like a way to start a process atomically, without fork()ing the parent: with all file descriptors closed, the environment set up the way I want, CWD set, etc. This should avoid all the horror of fork()ing my multithreaded parent app, dealing with file descriptor inheritance, etc. posix_spawn() should be ideal. Unfortunately, on Linux, posix_spawn() is implemented using fork() and exec()...

vfork() is defined to suspend the parent process until the child calls exec(). This would appear to be more like what I want, but my understanding was that vfork() is generally considered a historical relic these days and is equivalent to fork(). Is this still the case?

What's the least bad way of dealing with this?

Note that:

I cannot spawn my process before any threads start (because I can't run code at that point).
I cannot redesign my application not to need the helper process, due to external requirements.
I cannot suspend all my threads before spawning the helper process, because they don't belong to me.

This is on Linux. Java is involved, but all my code is in C.

2 answers

Forking a multithreaded application is considered safe as long as the child uses only async-signal-safe operations until it calls exec. POSIX says:

A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called. Fork handlers may be established by means of the pthread_atfork() function in order to maintain application invariants across fork() calls.
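To make the quoted rule concrete, here is a minimal sketch of a fork/exec helper whose child branch sticks to async-signal-safe calls. The names `spawn_helper` and `wait_for_helper`, the exit codes 126/127, and the descriptor bound of 1024 are assumptions made for illustration, not something from the question. Note that it still goes through the libc fork() wrapper, so any fork handlers registered by other code will run.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start `path` with the given argv/envp and working
 * directory. Returns the child's pid, or -1 if fork() failed.
 * Anything that allocates or takes locks must be done by the caller
 * before this point; argv and envp are built in the parent. */
pid_t spawn_helper(const char *path, char *const argv[], char *const envp[],
                   const char *cwd)
{
    pid_t pid = fork();
    if (pid != 0)
        return pid;                 /* parent (pid > 0) or failure (-1) */

    /* Child: chdir, close, execve and _exit are all on the POSIX
     * async-signal-safe list. No malloc, no stdio, no exit(). */
    if (cwd != NULL && chdir(cwd) != 0)
        _exit(126);
    for (int fd = 3; fd < 1024; fd++)   /* 1024 is an assumed upper bound */
        close(fd);
    execve(path, argv, envp);
    _exit(127);                     /* reached only if execve failed */
}

/* Reap the child; returns its exit status, or -1 if it did not exit
 * normally or waitpid failed. */
int wait_for_helper(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If the process can hold more than 1024 descriptors (quite possible with a JVM in the picture), the close loop would need a real upper bound computed in the parent before the fork.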

posix_spawn() is not the best idea:

It is also complicated to modify the environment of a multi-threaded process temporarily, since all threads must agree when it is safe for the environment to be changed. However, this cost is only borne by those invocations of posix_spawn() and posix_spawnp() that use the additional functionality. Since extensive modifications are not the usual case, and are particularly unlikely in time-critical code, keeping much of the environment control out of posix_spawn() and posix_spawnp() is appropriate design.

(see man posix_spawn)
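For comparison, this is roughly what a posix_spawn() call looks like when the child's environment and a descriptor to close are passed in explicitly. It is only a sketch: `spawn_with_posix_spawn` and its `fd_to_close` parameter are hypothetical names, and it does not set the child's working directory, which the question also asked for.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical wrapper: spawn `path` with an explicit environment and,
 * if fd_to_close >= 0, close that descriptor in the child.
 * Returns 0 on success or an errno-style error number. */
int spawn_with_posix_spawn(const char *path, char *const argv[],
                           char *const envp[], int fd_to_close,
                           pid_t *out_pid)
{
    posix_spawn_file_actions_t actions;
    int rc = posix_spawn_file_actions_init(&actions);
    if (rc != 0)
        return rc;

    /* File actions are replayed in the child before the exec step. */
    if (fd_to_close >= 0)
        rc = posix_spawn_file_actions_addclose(&actions, fd_to_close);

    /* envp is handed to the new program directly, so the parent's own
     * environment does not have to be modified. */
    if (rc == 0)
        rc = posix_spawn(out_pid, path, &actions, NULL, argv, envp);

    posix_spawn_file_actions_destroy(&actions);
    return rc;
}
```

Each descriptor needs its own addclose action, so "close everything" means enumerating the open descriptors in the parent first.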

I guess your problems come from resources replicated from the parent. You may clean them up using a pthread_atfork() handler (you use pthreads, right?). The other way is to use the low-level process creation function clone(). It gives you almost full control over what exactly the child process inherits from its parent.
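A minimal sketch of the pthread_atfork() idea, assuming the problematic resource is a mutex you own (`cache_lock` and the function names here are made up for illustration). The prepare handler takes the lock so that no other thread holds it at the moment of the fork, and both the parent and the child release it afterwards.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock guarding some shared state in the parent. */
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;

static void before_fork(void)       { pthread_mutex_lock(&cache_lock); }
static void after_fork_parent(void) { pthread_mutex_unlock(&cache_lock); }
static void after_fork_child(void)  { pthread_mutex_unlock(&cache_lock); }

/* Register the handlers once, early. Returns 0 on success. */
int install_fork_handlers(void)
{
    return pthread_atfork(before_fork, after_fork_parent, after_fork_child);
}

/* Demo: fork and check that the child can take the lock.
 * Returns 1 if it could, 0 if not, -1 on fork/wait failure. */
int child_can_lock_after_fork(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int rc = pthread_mutex_trylock(&cache_lock);
        _exit(rc == 0 ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

This only covers locks you know about. Locks held by threads that belong to other code in the process are not helped by it.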

[UPDATE]

Probably the simplest way to get rid of the problem is to change your forking scheme. For example, you can create a new process (fork) before your program initializes any of its resources, i.e. call fork() in main() before you create any threads. In the child process, set up a signal handler (for example for SIGUSR2) and sleep. When the parent needs to exec some new process, it sends SIGUSR2 to the child. When the child catches it, it calls fork/exec.
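The scheme above could look something like the sketch below. A few assumptions are mine rather than the answer's: the spawner blocks in sigwait() instead of installing a handler and sleeping, it launches one fixed command per SIGUSR2, and SIGTERM is used to shut it down. `start_spawner` is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigwait, sigprocmask */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a single-threaded spawner process. Meant to be called before any
 * thread exists. Returns the spawner's pid in the parent, -1 on failure.
 * Each SIGUSR2 sent to the spawner makes it fork/exec `path` once. */
pid_t start_spawner(const char *path, char *const argv[])
{
    sigset_t set, old;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR2);
    sigaddset(&set, SIGTERM);
    /* Block before forking so a signal sent early stays pending
     * instead of killing the spawner. */
    sigprocmask(SIG_BLOCK, &set, &old);

    pid_t pid = fork();
    if (pid != 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);  /* parent: restore mask */
        return pid;
    }

    /* Spawner loop: still single-threaded, so fork() here is harmless. */
    for (;;) {
        int sig;
        if (sigwait(&set, &sig) != 0 || sig == SIGTERM)
            _exit(0);
        pid_t worker = fork();
        if (worker == 0) {
            sigprocmask(SIG_SETMASK, &old, NULL);  /* clean mask for helper */
            execv(path, argv);
            _exit(127);                            /* exec failed */
        }
        if (worker > 0)
            waitpid(worker, NULL, 0);              /* reap the helper */
    }
}
```

The parent then requests a helper with `kill(spawner_pid, SIGUSR2)`. Standard signals do not queue, so two requests sent in quick succession may collapse into one; a pipe or socket would be the usual choice if per-request arguments or reliable counting are needed.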

Calling fork should be safe if you limit yourself to "raw" system calls (syscall(SYS_fork), syscall(SYS_execve, ...), etc.). Call into any glibc routine, and you'll be in a lot of trouble.

Calling vfork is not at all what you want: only the thread that called vfork is suspended, and the other threads continue to run (in the same address space as the vforked child). This is very likely to complicate your life.

Calling clone directly is possible, but exceedingly tricky. We have an implementation that allows for safe forking of child processes from multithreaded apps (unfortunately not open source). That code is very tricky, and surprisingly long.