I have a couple of assumptions, most likely some of them will be incorrect. Please correct me where they are wrong.
We could categorize the functions in a program written in C as follows:
Functions that are standard C functions: They will just be sent to libc which will communicate with the kernel in the same way as above and 1 return will be sent to the program
Functions that are system calls: They are NOT sent directly to the kernel but have to pass to libc although it doesn't do any extra work.
Security checks (permissions, writing to unallocated mem, ...) are always done by the kernel, although libc and libraries above might also check it first.
All POSIX-compliant systems follow these rules
It might not be the same on Linux and on some other POSIX system (like FreeBSD).
On Linux, the ABI defines how a system call is done. Read about Linux kernel interfaces . The system calls are listed in syscalls(2) (but see also /usr/include/asm*/unistd.h
...). Read also vdso(7) . The assembler HowTo explains more details, but for 32 bits i686 only.
Most Linux libc are free software, you can study their source code. IMHO the source code of musl-libc is very readable.
To simplify a tiny bit, most system calls (eg write(2) ) are small C functions in the libc
which:
call the kernel using SYSENTER
machine instruction (and take care of passing the system call number and its arguments with the kernel convention, which is not the usual C ABI). What the kernel considers as a system call is only that machine instruction (and conventions about it).
handle the failure case, by passing it to errno(3) and returning -1
.
(IIRC, on failure, the carry -or perhaps the overflow- flag bit is set when the kernel returns from SYSENTER
; but I could be wrong in the details)
handle the success case, by returning a result.
You could invoke system calls without libc
, with some assembler code. This is unusual, but has been done (eg in BusyBox or in Bones ).
So the libc
code for write
is doing some tiny extra work (passing arguments, handling failure & errno
and success cases).
Some few system calls (probably getpid
& clock_gettime
) avoid the overhead of the SYSENTER
machine instruction (and user-mode -> kernel-mode switch) thanks to vDSO .
No you can't categorize things like that. When you program in C (but that makes no difference in almost all other languages), there is only functions and whatever is the real status of these, you call them exactly the same way. This is defined by ABI (how to pass parameters, get returned values, etc) and enforced by the compiler/linker. Of course some functions are just stubs. For example stubs to shared libraries functions (stubs may be need to load the library, dynamic link to the real function, etc) or system calls (this is more technical and differs from kernel to kernel). But from the viewpoint of your program everything is the same (this is why it is hard to understand difference between fread
and read
at the beginning: you call them the same way, they make almost the same job, what's the difference?).
POSIX doesn't say a single word about kernels... It just lists the C (and formerly ADA) API of a set of functions with minimal semantic (plus some command, tools, etc). Implementation of these is totally free.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.