
Are system calls directly sent to the kernel?

I have a couple of assumptions, and most likely some of them are incorrect. Please correct me where they are wrong.

We could categorize the functions in a program written in C as follows:

  • Functions that are sent to dynamically loaded libraries:
    1. These are sent to the library, which translates them into multiple standard C functions.
    2. The library passes them on to libc, where they are translated into multiple system calls.
    3. Libc passes those on to the kernel, where they are executed, and the return values are sent back to libc.
    4. Libc collects the return values, groups them by C function, and uses them to create one return value for each C function. These return values are sent back to the dynamically loaded library.
    5. This library collects all return values and uses them to create one return value that is sent back to the original program.
  • Functions that are either defined in the code or part of statically compiled libraries: everything is the same as in the category above, except that:
    1. The program itself already does the translation, into standard C functions where they are known, or otherwise into functions that call dynamically loaded libraries.
    2. The standard C functions are sent to libc, the others to the dynamically loaded libraries (where they are handled as above).
    3. The creation of one final return value, based on the return values from both types of functions, is done by the program.
  • Functions that are standard C functions: they are just sent to libc, which communicates with the kernel in the same way as above, and one return value is sent to the program.

  • Functions that are system calls: they are NOT sent directly to the kernel but have to pass through libc, although it doesn't do any extra work.

Security checks (permissions, writing to unallocated mem, ...) are always done by the kernel, although libc and libraries above might also check it first.
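One way to probe this assumption is to hand write(2) something invalid and look at the error code that comes back. The following is only a sketch, assuming Linux behaviour; the helper names `errno_for_bad_fd` and `errno_for_unreadable_buffer` are made up for illustration. The second helper writes into a pipe because, as far as I know, a pipe forces the kernel to actually copy the user buffer.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for MAP_ANONYMOUS on glibc */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns the errno left by write(2) on a descriptor that does not exist. */
int errno_for_bad_fd(void)
{
    errno = 0;
    if (write(-1, "x", 1) == -1)
        return errno;
    return 0;
}

/* Returns the errno left by write(2) when the source buffer lies in a page
 * that the process is not allowed to read (mapped with PROT_NONE).
 * Returns -1 if the set-up itself fails. */
int errno_for_unreadable_buffer(void)
{
    int fds[2];
    int saved = 0;
    void *page;

    if (pipe(fds) != 0)
        return -1;

    page = mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    errno = 0;
    if (write(fds[1], page, 16) == -1)
        saved = errno;          /* reported by the kernel, not by libc */

    munmap(page, 4096);
    close(fds[0]);
    close(fds[1]);
    return saved;
}
```

On a Linux system I would expect the first helper to report EBADF and the second EFAULT. The process is not killed by a signal in the second case, which suggests the check happens while the kernel copies the buffer rather than in libc.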

All POSIX-compliant systems follow these rules.

Things might not be the same on Linux and on other POSIX systems (like FreeBSD).

On Linux, the ABI defines how a system call is done. Read about Linux kernel interfaces. The system calls are listed in syscalls(2) (but see also /usr/include/asm*/unistd.h ...). Read also vdso(7). The assembler HowTo explains more details, but for 32-bit i686 only.
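To make the "system call number" idea concrete, here is a minimal sketch that asks for the process ID twice: once through the generic syscall(2) function with the number `SYS_getpid` from `<sys/syscall.h>`, and once through the ordinary getpid(2) wrapper. The function names `pid_via_syscall_number` and `pid_via_wrapper` are hypothetical, and the sketch assumes a Linux libc that provides syscall(2).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <sys/syscall.h>       /* SYS_getpid and the other SYS_* numbers */
#include <sys/types.h>
#include <unistd.h>

/* Ask for the process ID by system call number, via syscall(2). */
long pid_via_syscall_number(void)
{
    return syscall(SYS_getpid);
}

/* Ask for the process ID through the ordinary libc wrapper. */
long pid_via_wrapper(void)
{
    return (long)getpid();
}
```

Both functions should return the same value. Note that syscall(2) is itself a libc function, so this still goes through libc; it only bypasses the per-call wrapper.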

Most Linux libc implementations are free software, so you can study their source code. IMHO the source code of musl-libc is very readable.

To simplify a tiny bit, most system calls (e.g. write(2)) are small C functions in the libc which:

  1. call the kernel using a dedicated machine instruction (SYSENTER or int 0x80 on 32-bit x86, SYSCALL on x86-64), taking care of passing the system call number and its arguments with the kernel convention, which is not the usual C ABI. What the kernel considers as a system call is only that machine instruction (and the conventions about it).

  2. handle the failure case, by storing the error code in errno(3) and returning -1.
    (On Linux, the kernel signals failure by returning a small negative value, the negated error code, in the result register. The convention of setting the carry flag on failure is the one used by FreeBSD and other BSDs.)

  3. handle the success case, by returning a result.

You could invoke system calls without libc, with some assembler code. This is unusual, but has been done (e.g. in BusyBox or in Bones).

So the libc code for write does some tiny extra work (passing arguments, handling failure & errno and success cases).
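The three wrapper steps above can be sketched in C with a little inline assembler. This is an illustration only, not the code of any particular libc: the names `raw_syscall3` and `my_write` are made up, and the sketch assumes Linux on x86-64 or AArch64 with GCC or Clang. The range -4095..-1 used to detect failure is the Linux convention for negated error codes.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_write */
#include <sys/types.h>     /* ssize_t */
#include <unistd.h>        /* only so callers get pipe(), read(), close() */

/* Step 1: enter the kernel with the system call number and three
 * arguments placed in the registers the kernel convention expects. */
static long raw_syscall3(long nr, long a1, long a2, long a3)
{
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(nr), "D"(a1), "S"(a2), "d"(a3)
                      : "rcx", "r11", "memory");
    return ret;
#elif defined(__linux__) && defined(__aarch64__)
    register long x8 __asm__("x8") = nr;
    register long x0 __asm__("x0") = a1;
    register long x1 __asm__("x1") = a2;
    register long x2 __asm__("x2") = a3;
    __asm__ volatile ("svc #0"
                      : "+r"(x0)
                      : "r"(x8), "r"(x1), "r"(x2)
                      : "memory", "cc");
    return x0;
#else
#error "this sketch only covers Linux on x86-64 and AArch64"
#endif
}

/* A write(2)-like wrapper built on the raw entry point. */
ssize_t my_write(int fd, const void *buf, size_t count)
{
    long ret = raw_syscall3(SYS_write, (long)fd, (long)buf, (long)count);

    /* Step 2: failure. Linux returns the negated error code. */
    if (ret < 0 && ret >= -4095) {
        errno = (int)-ret;
        return -1;
    }

    /* Step 3: success. Pass the kernel's result through. */
    return (ssize_t)ret;
}
```

Calling `my_write(1, "hi\n", 3)` should behave like write(2) for this simple case. A real libc wrapper typically does more, for example around thread cancellation.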

A few system calls (typically clock_gettime and gettimeofday) avoid the overhead of the system call instruction (and the user-mode -> kernel-mode switch) thanks to the vDSO.
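As a rough sketch of how this looks from a program: the kernel advertises where it mapped the vDSO through the auxiliary vector entry `AT_SYSINFO_EHDR`, and the program simply calls clock_gettime as usual. The function names `vdso_base_address` and `monotonic_nanoseconds` are made up for illustration. Whether a given call is really served from the vDSO depends on the architecture, the kernel and the libc, so this code cannot prove it; strace(1) can, because calls served from the vDSO do not show up there.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for clock_gettime under strict -std modes */
#endif
#include <sys/auxv.h>          /* getauxval, AT_SYSINFO_EHDR */
#include <time.h>

/* Address at which the kernel mapped the vDSO into this process,
 * or 0 if the auxiliary vector has no such entry. */
unsigned long vdso_base_address(void)
{
    return getauxval(AT_SYSINFO_EHDR);
}

/* Monotonic clock in nanoseconds, or -1 on error. On common Linux
 * setups the libc routes this call through the vDSO. */
long long monotonic_nanoseconds(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```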

No, you can't categorize things like that. When you program in C (and it makes no difference in almost all other languages), there are only functions, and whatever their real status is, you call them exactly the same way. This is defined by the ABI (how to pass parameters, get return values, etc.) and enforced by the compiler/linker. Of course some functions are just stubs: for example, stubs for shared-library functions (a stub may be needed to load the library, dynamically link to the real function, etc.) or for system calls (this is more technical and differs from kernel to kernel). But from the viewpoint of your program everything is the same (this is why it is hard to understand the difference between fread and read at the beginning: you call them the same way and they do almost the same job, so what's the difference?).
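To illustrate the fread/read point, here is a small sketch that reads the same file both ways. The helper names `make_demo_file`, `read_with_fread` and `read_with_read` are made up for this example. At the call site both are ordinary C function calls; the difference is only in what happens behind them (fread is a stdio function with its own user-space buffering, read is a thin wrapper around the system call).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for mkstemp under strict -std modes */
#endif
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Create a temporary file from a mkstemp(3) template and fill it with a
 * short text. Returns 0 on success, -1 on failure. */
int make_demo_file(char *path_template)
{
    static const char text[] = "hello, kernel\n";
    int fd = mkstemp(path_template);
    ssize_t n;

    if (fd < 0)
        return -1;
    n = write(fd, text, sizeof text - 1);
    close(fd);
    return n == (ssize_t)(sizeof text - 1) ? 0 : -1;
}

/* Read up to cap bytes with the stdio library function fread(3). */
size_t read_with_fread(const char *path, char *buf, size_t cap)
{
    FILE *fp = fopen(path, "rb");
    size_t n;

    if (fp == NULL)
        return 0;
    n = fread(buf, 1, cap, fp);
    fclose(fp);
    return n;
}

/* Read up to cap bytes with the read(2) system call wrapper. */
ssize_t read_with_read(const char *path, char *buf, size_t cap)
{
    int fd = open(path, O_RDONLY);
    ssize_t n;

    if (fd < 0)
        return -1;
    n = read(fd, buf, cap);
    close(fd);
    return n;
}
```

For a small regular file both helpers should return the same bytes. Running the program under strace(1) would show read system calls in both cases, with fread issuing them on your behalf.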

POSIX doesn't say a single word about kernels... It just lists the C (and formerly Ada) API of a set of functions with minimal semantics (plus some commands, tools, etc.). Implementations are totally free in how they provide these.
