
Why is sys_fork not used by glibc's implementation of fork?

In eglibc's nptl/sysdeps/unix/sysv/linux/i386/fork.c there's a definition:

#define ARCH_FORK() \
  INLINE_SYSCALL (clone, 5,                           \
          CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD, 0,     \
          NULL, NULL, &THREAD_SELF->tid)

which is used in the actual __libc_fork() as the heart of the implementation. But Linux's arch/x86/entry/syscalls/syscall_32.tbl, for example, has a sys_fork entry, and so does syscall_64.tbl. So apparently Linux does have a dedicated syscall for fork.

So I now wonder: why does glibc implement fork() in terms of clone, if the kernel already provides the fork syscall?

I looked at the commit where Ulrich Drepper added that code to glibc, and there wasn't any explanation in the commit log (or elsewhere).

Have a look at Linux's implementation of fork, though:

return _do_fork(SIGCHLD, 0, 0, NULL, NULL, 0);

And here is clone:

return _do_fork(clone_flags, newsp, 0, parent_tidptr, child_tidptr, tls);

Obviously, they are almost exactly the same. The only difference is that when calling clone, you can set various flags, specify a stack pointer for the new process, and so on. fork doesn't take any arguments.

Looking at Drepper's code, the clone flags are CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD. If fork were used, the only flag would be SIGCHLD.
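To make the comparison concrete, here is a minimal sketch of a fork-like call made through the raw clone syscall with SIGCHLD as the only flag, which is what the kernel's fork does internally. The helper names fork_via_clone and child_exit_status are made up for illustration, and the argument order assumes x86 or x86-64 Linux (a few architectures swap the first two raw clone arguments).

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: what the fork syscall does, spelled as clone.
 * The only flag is SIGCHLD; a stack pointer of 0 means "keep using the
 * copied stack". Assumes x86 or x86-64 Linux argument order. */
static pid_t fork_via_clone(void)
{
    return (pid_t)syscall(SYS_clone, (long)SIGCHLD, 0L, NULL, NULL, 0L);
}

/* Hypothetical helper: create a child that exits with `code`, then
 * return the exit status the parent collects (or -1 on error). */
static int child_exit_status(int code)
{
    int status;
    pid_t pid = fork_via_clone();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                /* child: leave immediately */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling child_exit_status(7) from a single-threaded program should behave like a fork/_exit/waitpid round trip. Note that going through syscall() bypasses whatever bookkeeping glibc's own fork() wrapper does, so this is only a demonstration.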

Here is what the clone manpage says about those extra flags:

CLONE_CHILD_CLEARTID (since Linux 2.5.49)
          Erase child thread ID at location ctid in child memory when  the  child
          exits,  and  do  a  wakeup  on  the futex at that address.  The address
          involved may be changed by the set_tid_address(2) system call.  This is
          used by threading libraries.

CLONE_CHILD_SETTID (since Linux 2.5.49)
          Store child thread ID at location ctid in child memory.

...And you can see that he does pass a pointer to where the kernel should first store the child's thread ID and then later do a futex wakeup. Is glibc doing a futex wait on that address somewhere? I don't know. If so, that would explain why Drepper chose to use clone .
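The effect of CLONE_CHILD_SETTID can be observed directly. The sketch below is an illustration, not glibc's code: the static variable tid_slot stands in for THREAD_SELF->tid, and the helper name child_finds_tid_stored is made up. It assumes the raw clone argument order of x86-64 in one branch and the order used by i386 and arm64 (tls before the child TID pointer) in the other; other architectures may differ.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for THREAD_SELF->tid (an assumption for illustration). */
static pid_t tid_slot;

/* Returns 1 if the child found its own thread ID already stored in
 * tid_slot when it started running, 0 if not, -1 on error. */
static int child_finds_tid_stored(void)
{
    long flags = CLONE_CHILD_SETTID | SIGCHLD;
    long pid;
    int status;

    tid_slot = 0;
#if defined(__x86_64__)
    /* x86-64 order: flags, stack, parent_tid, child_tid, tls */
    pid = syscall(SYS_clone, flags, 0L, NULL, &tid_slot, 0L);
#else
    /* Assumed order for i386, arm64 and similar: tls before child_tid */
    pid = syscall(SYS_clone, flags, 0L, NULL, 0L, &tid_slot);
#endif
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: in a single-threaded process the TID equals the PID. */
        _exit(tid_slot == (pid_t)syscall(SYS_getpid) ? 0 : 1);
    }
    if (waitpid((pid_t)pid, &status, 0) != (pid_t)pid)
        return -1;
    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

The kernel writes only into the child's copy of memory, so the parent's tid_slot stays 0. One plausible reading is that glibc wants the child's cached thread ID to be correct without an extra syscall, but that is an inference, not something the commit log confirms.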

(And if not, it would be just one more example of the extreme accumulation of cruft which is our beloved glibc! If you wanted to find some nice, clean, well-maintained code, just keep moving and go have a look at musl libc!)

In a nutshell: why not?

You have one syscall that is guaranteed to exist on all platforms (you do realize that Intel isn't the only platform out there, right?), and another that is deprecated because it is unnecessary. They both carry the exact same semantics. Your code is much more compact when you only call the one guaranteed to exist.

I will elaborate on that a little.

fork is defined by POSIX, while clone is Linux-specific. However, Linux, on occasion, takes POSIX-defined "system calls" and implements them in user space. Such is the case for fork (and vfork and pthread_create). They are all implemented in user space by calling clone.

As such, fork is deemed unnecessary at the kernel level. If a thin user space wrapper can implement it, the kernel is okay with that. So, on Linux, clone is guaranteed to exist on all platforms, while fork may or may not exist, depending on the specific platform.
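The compactness argument can be sketched as follows. If a libc preferred the fork syscall, it would need a compile-time fallback for platforms whose headers do not define SYS_fork (arm64 is one such platform, as far as I know). The helper names has_fork_syscall, raw_fork and raw_fork_status are made up for illustration, and the clone fallback assumes flags is the first raw argument.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reports whether the kernel headers for this platform expose fork. */
static int has_fork_syscall(void)
{
#ifdef SYS_fork
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical wrapper: prefer fork, fall back to clone where the fork
 * syscall number does not exist. Calling clone unconditionally would
 * make this #ifdef unnecessary. */
static pid_t raw_fork(void)
{
#ifdef SYS_fork
    return (pid_t)syscall(SYS_fork);
#else
    return (pid_t)syscall(SYS_clone, (long)SIGCHLD, 0L, NULL, NULL, 0L);
#endif
}

/* Hypothetical helper: the child exits with `code`; the parent returns
 * the collected exit status, or -1 on error. */
static int raw_fork_status(int code)
{
    int status;
    pid_t pid = raw_fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both branches should produce the same observable result, which is the point: the branch buys nothing.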
