I am creating a syscall tracer using seccomp. I don't change anything in the system call; I just log it in my structure, and when the process finishes I dump this structure to disk.
When I run my program (it's called tracer) like this:
tracer env
Everything works well, and I see the logs in the file afterwards. However, if I try to trace a program that calls execve internally, it fails:
tracer watch -n1 env
or
tracer strace -o /tmp/log env
Both fail with the following output:
env: error while loading shared libraries: cannot create cache for search path: Cannot allocate memory
and the log:
$ cat /tmp/log
execve("/usr/bin/env", ["env"], [/* 19 vars */]) = 0
brk(NULL) = 0x415000
mmap(0xffffffffffffffda, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2
writev(103, [{iov_base="env", iov_len=3}, {iov_base=": ", iov_len=2}, {iov_base="error while loading shared libraries", iov_len=36}, {iov_base=": ", iov_len=2}, {iov_base="", iov_len=0}, {iov_base="", iov_len=0}, {iov_base="cannot create cache for search path", iov_len=35}, {iov_base=": ", iov_len=2}, {iov_base="Cannot allocate memory", iov_len=22}, {iov_base="\n", iov_len=1}], 10) = 127
+++ exited with 127 +++
Notice the weird mmap address and its return value. I don't understand what is wrong or why this happens. Any other program works fine, so I guess the problem is with copying the seccomp filters to the forked process that calls execve.
Here are my seccomp rules:
struct sock_filter filter[] = {
BPF_STMT(BPF_LD + BPF_W + BPF_ABS, offsetof(struct seccomp_data, nr)),
BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, __NR_openat, 0, 1),
BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_TRACE),
BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, __NR_write, 0, 1),
BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_TRACE),
BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, __NR_mmap, 0, 1),
BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_TRACE),
BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, __NR_mprotect, 0, 1),
BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_TRACE),
BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, __NR_close, 0, 1),
BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_TRACE),
BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_ALLOW),
};
I have not listed the whole code because it is straightforward and can essentially be written in only one way; it also follows the article I referred to above. The problem is known on the Internet, but I was not able to find a solution. If you still want the whole code or an MCVE, I can provide it.
Also, when I add execve to the traced syscalls, I get different behavior:
struct sock_filter filter[] = {
BPF_STMT(BPF_LD + BPF_W + BPF_ABS, offsetof(struct seccomp_data, nr)),
BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, __NR_openat, 0, 1),
BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_TRACE),
BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, __NR_write, 0, 1),
BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_TRACE),
BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, __NR_mmap, 0, 1),
BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_TRACE),
BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, __NR_mprotect, 0, 1),
BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_TRACE),
BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, __NR_close, 0, 1),
BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_TRACE),
BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, __NR_execve, 0, 1),
BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_TRACE),
BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_ALLOW),
};
The log becomes:
$ cat /tmp/log
execve(0xffffffffffffffda, ["env"], [/* 19 vars */]) = -1 ENOSYS (Function not implemented)
getpid() = 15535
exit_group(1) = ?
+++ exited with 1 +++
Linux 4.4 aarch64, Linux 4.15 x86-64
The more time I spend on this problem, the more I realize that it actually comes from how the kernel handles seccomp filters across fork. The kernel copies the filters from a process to its child, but it does not attach the tracer to that child. So all of the SECCOMP_RET_TRACE rules are inherited, yet the sub-child has no tracer, and every system call matched by those rules returns -ENOSYS there.
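The ENOSYS effect can be reproduced in isolation. The following is a minimal sketch, not taken from the tracer above: the helper name errno_without_tracer and the choice of getppid as the probed syscall are only illustrative assumptions. It forks a child that installs a one-rule SECCOMP_RET_TRACE filter on itself, and nobody ever attaches to that child with ptrace.

```c
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the errno seen by an untraced child whose
 * filter answers getppid with SECCOMP_RET_TRACE, 0 if the call succeeded,
 * or -1 if the experiment could not be set up (fork or prctl failed). */
int errno_without_tracer(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Same shape as the rules in the question, reduced to one syscall. */
        struct sock_filter filter[] = {
            BPF_STMT(BPF_LD + BPF_W + BPF_ABS, offsetof(struct seccomp_data, nr)),
            BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, __NR_getppid, 0, 1),
            BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_TRACE),
            BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_ALLOW),
        };
        struct sock_fprog prog = {
            .len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
            .filter = filter,
        };

        /* An unprivileged process must set no_new_privs before it may
         * install a filter. Exit code 255 marks a setup failure. */
        if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0 ||
            prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) != 0)
            _exit(255);

        /* No tracer is attached here, so the kernel skips the syscall. */
        errno = 0;
        long rc = syscall(__NR_getppid);
        _exit(rc == -1 ? errno : 0);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status) ||
        WEXITSTATUS(status) == 255)
        return -1;
    return WEXITSTATUS(status);
}
```

Calling errno_without_tracer() from main and comparing the result with ENOSYS should show the same failure mode as in the logs, assuming the environment allows installing seccomp filters.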
I have found a way to solve this problem. To attach the tracer to child processes as well, or at least to avoid the ENOSYS problem in sub-children, we can specify the PTRACE_O_TRACEFORK and PTRACE_O_TRACECLONE flags when setting the ptrace options:
ptrace(PTRACE_SETOPTIONS, child, 0, PTRACE_O_TRACESECCOMP | PTRACE_O_TRACEFORK | PTRACE_O_TRACECLONE);
Why we need both is not easy to explain briefly. First, which syscalls exist on a system and which of them programs actually use (usually through the libc implementation) depends on the architecture and the libc. This list may not even be complete: we may also have to track vfork and other ways of cloning (or spawning) a thread or a process (remember, threads are light-weight processes in Linux). What these options do is specified in the ptrace man page:
PTRACE_O_TRACECLONE (since Linux 2.5.46)
Stop the tracee at the next clone(2) and automatically start tracing the newly cloned process, which will start with a SIGSTOP, or PTRACE_EVENT_STOP if PTRACE_SEIZE was used. A waitpid(2) by the tracer will return a status value such that
status>>8 == (SIGTRAP | (PTRACE_EVENT_CLONE<<8))
The PID of the new process can be retrieved with PTRACE_GETEVENTMSG. This option may not catch clone(2) calls in all cases. If the tracee calls clone(2) with the CLONE_VFORK flag, PTRACE_EVENT_VFORK will be delivered instead if PTRACE_O_TRACEVFORK is set; otherwise if the tracee calls clone(2) with the exit signal set to SIGCHLD, PTRACE_EVENT_FORK will be delivered if PTRACE_O_TRACEFORK is set.
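The status check quoted from the man page can be turned into a small decoder. This is only a sketch: the function names ptrace_event_of and is_new_child_event are made up for illustration, and the decoder assumes the plain SIGTRAP encoding quoted above rather than the PTRACE_SEIZE variants.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

/* Hypothetical helper: returns the PTRACE_EVENT_* code carried by a waitpid
 * status, or 0 when the status is not a SIGTRAP-based ptrace event stop. */
int ptrace_event_of(int status)
{
    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGTRAP)
        return 0;
    /* status >> 8 == SIGTRAP | (event << 8), so the event sits 16 bits up. */
    return (status >> 16) & 0xffff;
}

/* Hypothetical helper: nonzero when the stop announces a new process or
 * thread that the tracer has just been attached to automatically. */
int is_new_child_event(int status)
{
    int event = ptrace_event_of(status);
    return event == PTRACE_EVENT_FORK ||
           event == PTRACE_EVENT_VFORK ||
           event == PTRACE_EVENT_CLONE;
}
```

When is_new_child_event(status) is true, the PID of the new tracee can then be fetched with PTRACE_GETEVENTMSG, as the man page describes.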
The reason this works in my case is that after a plain clone, the seccomp rules were copied to the cloned process but the tracer wasn't. With these flags, the parent process automatically becomes the tracer of every child process, so the rules are copied, a tracer is attached, and everything works like a charm.
NOTE: Because the parent process becomes the tracer this way, you also need to wait for all children and sub-children, not only the process you actually spawned. To do this, pass -1 as the pid argument to waitpid or similar syscalls:
const pid_t childWaited = waitpid(-1, &status, 0);
// but not const pid_t result = waitpid(myChildPid, &status, 0);
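Put together, the wait loop might look roughly like the sketch below. It is not the original tracer: wait_for_all_tracees is a hypothetical name, the syscall logging is left out, and swallowing every SIGTRAP and SIGSTOP is a simplifying assumption that a real tracer would refine. __WALL is used so that cloned threads are reported as well.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: waits on every child and traced descendant until none
 * are left, and returns how many of them exited or were killed by a signal. */
int wait_for_all_tracees(void)
{
    int finished = 0;

    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, __WALL);

        if (pid < 0) {
            if (errno == EINTR)
                continue;
            break; /* typically ECHILD: nothing left to wait for */
        }

        if (WIFEXITED(status) || WIFSIGNALED(status)) {
            finished++;
            continue;
        }

        if (WIFSTOPPED(status)) {
            int sig = WSTOPSIG(status);

            /* Assumption: SIGTRAP stops (ptrace events, including
             * PTRACE_EVENT_SECCOMP, where the syscall would be logged) and
             * the SIGSTOP a freshly auto-attached child starts with are
             * swallowed; any other signal is passed on to the tracee. */
            if (sig == SIGTRAP || sig == SIGSTOP)
                sig = 0;

            ptrace(PTRACE_CONT, pid, NULL, (void *)(long)sig);
        }
    }
    return finished;
}
```

The tracer would call this once after starting the first child and setting the ptrace options on it; the loop ends only when waitpid reports that no children or tracees remain.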