Linux Kernel 4.2.x: Why does the expected system call address not match the actual address when checked?

Short Background

I'm currently writing a Linux kernel module as a project to better understand Linux kernel internals. I've written 'hello world'-type modules before, but I want to get beyond that, so I'm trying to replace some common system calls like open, read, write, and close with my own so that I can print a bit more information into the system log.

Some of the content I found while searching targets pre-2.6 kernels, which is not useful because the sys_call_table symbol stopped being exported starting with kernel 2.6.x. The material I found for 2.6.x or later, on the other hand, seems to have problems of its own, even though it apparently worked at the time.

One particular O'Reilly article, which I found via the "sys_call_table in linux kernel 2.6.18" post, suggests that what I'm trying to do ought to work, but it doesn't. (Specifically, see the "Intercepting sys_unlink() Using System.map" section.)

I also read through "Linux Kernel: System call hooking example" and "Kernel sys_call_table address does not match address specified in system.map" which, while somewhat informative, were not useful for me.

Problems and Questions

Part 1 - Unexpected Address Mismatch

I'm using Linux kernel 4.2.0-16-generic on a Kubuntu 15.10 x86_64 installation. Since the sys_call_table symbol is no longer exported, I grepped the address from the system map file:

# grep 'sys_call_table' < System.map-4.2.0-16-generic
ffffffff818001c0 R sys_call_table
ffffffff81801580 R ia32_sys_call_table
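Note that the grep pattern above matches by substring, which is why ia32_sys_call_table also shows up. The following is a minimal user-space sketch of an exact-match lookup over System.map-style lines. The helper names systrap_parse_map_line and systrap_lookup_symbol are hypothetical and are not part of any kernel or libc API. It also assumes the kernel was not relocated at boot; if kernel address space layout randomization is active, the runtime address differs from the one in System.map.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one System.map line of the form "<hex address> <type> <symbol>".
 * Returns 1 and stores the address in *addr when the symbol name matches
 * exactly; returns 0 otherwise. (Hypothetical helper, not a kernel API.)
 */
int systrap_parse_map_line(const char *line, const char *symbol,
                           unsigned long long *addr)
{
    unsigned long long value;
    char type;
    char name[128];

    assert(line && symbol && addr);

    if (sscanf(line, "%llx %c %127s", &value, &type, name) != 3)
        return 0;               /* malformed line */
    if (strcmp(name, symbol) != 0)
        return 0;               /* e.g. rejects ia32_sys_call_table */
    *addr = value;
    return 1;
}

/* Scan a System.map-style file for a symbol; returns 1 if it was found. */
int systrap_lookup_symbol(const char *path, const char *symbol,
                          unsigned long long *addr)
{
    char line[256];
    int found = 0;
    FILE *fp = fopen(path, "r");

    if (!fp)
        return 0;
    while (!found && fgets(line, sizeof(line), fp))
        found = systrap_parse_map_line(line, symbol, addr);
    fclose(fp);
    return found;
}
```

A caller would pass the path of the map file for the running kernel (for example the System.map-4.2.0-16-generic file used above) and the symbol name "sys_call_table".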

With this in hand, I added the following line to my kernel module:

static unsigned long *syscall_table = (unsigned long *) 0xffffffff818001c0;

Based on this, I expected a simple check to confirm that I was pointing to the location I thought I was pointing to, i.e. the base address of the kernel's unexported sys_call_table. So I wrote a check like the one below into the module's init function to verify:

if(syscall_table[__NR_close] != (unsigned long)sys_close)
{
        /* cast both values to pointers so that they match the %p format */
        pr_info("sys_close = 0x%p, syscall_table[__NR_close] = 0x%p\n",
                (void *)sys_close, (void *)syscall_table[__NR_close]);
        return -ENXIO;
}

This check failed, and different addresses were printed in the log.

I was not expecting the body of this if statement to execute, because I thought the value stored in syscall_table[__NR_close] would be the same as the address of sys_close, but it does execute.

Q1: Have I missed something so far regarding the expected address-based comparison? If so, what?

Part 2 - Partially Successful?

If I remove this check, it seems I'm partially successful because, apparently, I can at least replace the read call using the code below:

static asmlinkage ssize_t (*original_read)(unsigned int fd, char __user *buf, size_t count);
// ...
static void systrap_replace_syscalls(void)
{
    pr_debug("systrap: replacing system calls\n");

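    /* the table holds unsigned long values, so cast when saving the originals */
    original_read  = (void *)syscall_table[__NR_read];
    original_write = (void *)syscall_table[__NR_write];
    original_close = (void *)syscall_table[__NR_close];

    /* clear CR0.WP (bit 16, 0x10000) so the read-only table can be written */
    write_cr0(read_cr0() & ~0x10000);

    syscall_table[__NR_read]  = (unsigned long)systrap_read;
    syscall_table[__NR_write] = (unsigned long)systrap_write;
    syscall_table[__NR_close] = (unsigned long)systrap_close;

    /* set CR0.WP again to restore write protection */
    write_cr0(read_cr0() | 0x10000);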

    pr_debug("systrap: system calls replaced\n");
}
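The constant 0x10000 used with write_cr0() is bit 16 of the x86 CR0 register, the write-protect (WP) flag. The sketch below only illustrates the mask arithmetic in user space. The names SYSTRAP_CR0_WP, systrap_cr0_without_wp and systrap_cr0_with_wp are hypothetical, and 0x80050033 is just an example CR0 value chosen for illustration. Inside a real module, the kernel's own X86_CR0_WP constant could be used instead of the magic number.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 16 of CR0 is the write-protect (WP) flag, i.e. 0x10000. */
#define SYSTRAP_CR0_WP_BIT 16
#define SYSTRAP_CR0_WP     (UINT64_C(1) << SYSTRAP_CR0_WP_BIT)

/* Value to load into CR0 so that supervisor writes ignore read-only pages. */
uint64_t systrap_cr0_without_wp(uint64_t cr0)
{
    return cr0 & ~SYSTRAP_CR0_WP;
}

/* Value to load into CR0 to turn write protection back on. */
uint64_t systrap_cr0_with_wp(uint64_t cr0)
{
    return cr0 | SYSTRAP_CR0_WP;
}
```

Keep in mind that CR0 is a per-CPU register, so clearing WP this way only affects the CPU that executes the write; the window between the two write_cr0() calls should be kept as short as possible.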

My replacement functions simply print a message and forward the call to the original system call. For example, the read replacement function's code is below:

static asmlinkage ssize_t systrap_read(unsigned int fd, char __user *buf, size_t count)
{
        pr_debug("systrap: reading from fd = %u\n", fd);
        return original_read(fd, buf, count);
}

The system log shows the following output when I insmod and rmmod the module:

kernel: [23226.797460] systrap: setting up module
kernel: [23226.797462] systrap: replacing system calls
kernel: [23226.797464] systrap: system calls replaced
kernel: [23226.797465] systrap: module setup complete
kernel: [23226.864198] systrap: reading from fd = 4279272912

<similar output omitted for brevity>

kernel: [23235.560663] systrap: reading from fd = 2835745072
kernel: [23235.564774] systrap: reading from fd = 861079840
kernel: [23235.564986] systrap: cleaning up module
kernel: [23235.564990] systrap: trying to restore system calls
kernel: [23235.564993] systrap: restored sys_read
kernel: [23235.564995] systrap: restored sys_write
kernel: [23235.564997] systrap: restored sys_close
kernel: [23235.565000] systrap: system call restoration attempt complete
kernel: [23235.565002] systrap: module cleanup complete

I can let it run for a long time and, oddly enough, I never observe entries for the write and close calls, only for the reads, which is why I thought I was only partially successful.

Q2: Have I missed something regarding the replaced system calls? If so, what?

Part 3 - Unexpected Error Message on rmmod Command

Even though the module seems to operate normally, I always get the following error when I rmmod the module from the kernel:

rmmod: ERROR: ../libkmod/libkmod.c:506 lookup_builtin_file() could not open builtin file '(null)/modules.builtin.bin'

My module cleanup function simply calls another one (below) that tries to restore the system calls by doing the opposite of the replacement function above:

// called by the exit function
static void systrap_restore_syscalls(void)
{
    pr_debug("systrap: trying to restore system calls\n");
    write_cr0(read_cr0() & ~0x10000);

    /* make sure no other modules have made changes before restoring */
    if(syscall_table[__NR_read] == (unsigned long)systrap_read)
    {
            syscall_table[__NR_read] = (unsigned long)original_read;
            pr_debug("systrap: restored sys_read\n");
    }
    else
    {
            pr_warn("systrap: sys_read not restored; address mismatch\n");
    }
    // ... omitted: same stuff for other sys calls

    write_cr0(read_cr0() | 0x10000);
    pr_debug("systrap: system call restoration attempt complete\n");
}
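The save, replace, and guarded-restore pattern used by the two functions above can be tried out safely in user space with an ordinary array of function pointers. This is only a sketch of the pattern, not kernel code: the table, demo_original, demo_hook and the systrap_install/systrap_restore helpers are all hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*systrap_fn)(long arg);

systrap_fn systrap_saved_original;   /* entry that was in the slot before hooking */
long systrap_hook_calls;             /* how many times the hook has run */

/* Stand-in for the original handler. */
long demo_original(long arg)
{
    return arg;
}

/* Stand-in for a replacement: record the call, then forward it. */
long demo_hook(long arg)
{
    systrap_hook_calls++;
    return systrap_saved_original(arg);
}

/* Save the current entry, then install the hook in its place. */
void systrap_install(systrap_fn *table, size_t index, systrap_fn hook)
{
    assert(table && hook);
    systrap_saved_original = table[index];
    table[index] = hook;
}

/*
 * Restore the saved entry only if the slot still holds our hook, so that a
 * change made by someone else is not overwritten. Returns 1 if restored.
 */
int systrap_restore(systrap_fn *table, size_t index, systrap_fn hook)
{
    assert(table && hook);
    if (table[index] != hook)
        return 0;
    table[index] = systrap_saved_original;
    return 1;
}
```

Calling an entry through the table after systrap_install() runs the hook and still returns the original handler's result; a second call to systrap_restore() reports that there is nothing left to restore.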

Q3: I don't know what causes the error message; any ideas here?

Part 4 - sys_open Marked for Deprecation?

In another unexpected turn of events, I found that the __NR_open macro is no longer defined by default. In order to see the definition, I have to #define __ARCH_WANT_SYSCALL_NO_AT before #including the header files:

/*
 * Force __NR_open definition. It seems sys_open has been replaced by sys_openat(?)
 * See include/uapi/asm-generic/unistd.h:724-725
 */
#define __ARCH_WANT_SYSCALL_NO_AT

#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/module.h>
// ...

Going through the kernel source code (mentioned in the comment above), you find the following comments:

/*
* All syscalls below here should go away really,
* these are provided for both review and as a porting
* help for the C library version.
*
* Last chance: are any of these important enough to
* enable by default?
*/
#ifdef __ARCH_WANT_SYSCALL_NO_AT
#define __NR_open 1024
__SYSCALL(__NR_open, sys_open)
// ...

Can anyone clarify:

Q4: ...the comments above on why __NR_open is not available by default?

Q5: ...whether it's a good idea to do what I'm doing with the #define? and

Q6: ...what I should be using instead if I really shouldn't be trying to use __NR_open?

Epilogue - Crashing My System 😑

I tried using __NR_openat, replacing that call as I had done with the previous ones:

static asmlinkage long systrap_openat(int dfd, const char __user *filename, int flags, umode_t mode)
{
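    /* NOTE: filename is a user-space pointer; copy it with strncpy_from_user() before printing it with %s */
    pr_debug("systrap: opening file dfd = %d, name = %s\n", dfd, filename);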
    return original_openat(dfd, filename, flags, mode);
}

But this simply helped me unceremoniously crash my own system 😑 by causing other processes to segfault when they tried to open a file, with gems such as:

kernel: [135489.202693] systrap: opening file dfd = 0, name = P^Q
kernel: [135489.202913] zsh[11806]: segfault at 410 ip 00007f3a380abe60 sp 00007ffd04c5b550 error 4 in libc-2.21.so[7f3a37fe1000+1c0000]

Trying to print argument data also showed odd/garbage info.

Q7: Any additional suggestions on why it would suddenly crash and why the arguments seem to be garbage-like?

I've spent several days trying to work through this and I just hope I've not missed something utterly stupid...

Please let me know in the comments if something's not entirely clear to you and I'll attempt to clarify.

It'd be most helpful if you could provide some code snippets that actually work and/or point me in a precise-enough direction that would allow me to understand what I'm doing wrong and how to quickly get this fixed.

Answer

I've managed to complete this and I'm now taking the time to document my findings.

Q1: Have I missed something so far regarding the expected address-based comparison?

The problem with this comparison is that, after checking /proc/kallsyms, I saw that sys_close and other related symbols are also no longer exported. I already knew this for some symbols, but I was still under the (mistaken) impression that some others were still available. So the check I was using (below) evaluates to true and causes the module to fail the 'safety' check.

if(syscall_table[__NR_close] != (unsigned long)sys_close)
{
        /* ... */
}

In short, you simply need to trust the assumption about the system call table address retrieved from the System.map-$(uname -r) file. The 'safety' check is unnecessary and will also not work as expected.

Q2: Have I missed something regarding the replaced system calls?

This problem was eventually traced to one or both of the following header files I had included (I didn't bother to figure out which one):

#include <uapi/asm-generic/unistd.h>
#include <uapi/asm-generic/errno-base.h>

These were causing the __NR_* macros to get redefined, and therefore expanded, to incorrect values, at least for the x86_64 architecture. For example, the indices for sys_read and sys_write in the system call table are supposed to be 0 and 1 respectively, but they were getting other values and ended up indexing completely unexpected locations in the table.

Just removing the header files above fixed the issue without additional code changes.
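To make the mismatch concrete, here is a small user-space sketch that compares the generic numbering from asm-generic/unistd.h with the x86_64 numbering for the calls used in this module. The numbers are hardcoded from those two numbering schemes as I understand them, and the struct and function names are hypothetical. I have not verified which values the macros actually expanded to in the original build, so treat this as one plausible reading rather than a confirmed diagnosis: if the generic numbers were in effect, the read hook would have landed on the x86_64 slot for uname, whose first argument is a pointer, which would be consistent with the very large "fd" values in the log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One syscall with its number in both numbering schemes. */
struct systrap_nr_pair {
    const char *name;
    int x86_64_nr;       /* index into the x86_64 sys_call_table */
    int asm_generic_nr;  /* value from asm-generic/unistd.h */
};

static const struct systrap_nr_pair systrap_nr_table[] = {
    { "read",   0,   63 },
    { "write",  1,   64 },
    { "close",  3,   57 },
    { "openat", 257, 56 },
};

/* Name of the x86_64 syscall at a given index (only a few entries listed). */
const char *systrap_x86_64_name_at(int nr)
{
    switch (nr) {
    case 0:   return "read";
    case 1:   return "write";
    case 3:   return "close";
    case 56:  return "clone";
    case 57:  return "fork";
    case 63:  return "uname";
    case 64:  return "semget";
    case 257: return "openat";
    default:  return NULL;
    }
}

/*
 * Which x86_64 syscall would a hook for `name` replace if the generic
 * number were used as the table index? Returns NULL for unknown names.
 */
const char *systrap_misdirected_target(const char *name)
{
    size_t i;

    assert(name);
    for (i = 0; i < sizeof(systrap_nr_table) / sizeof(systrap_nr_table[0]); i++) {
        if (strcmp(systrap_nr_table[i].name, name) == 0)
            return systrap_x86_64_name_at(systrap_nr_table[i].asm_generic_nr);
    }
    return NULL;
}
```

Under that assumption the write and close hooks would sit on rarely used slots (semget and fork), and an openat hook would sit on clone, which would fit both the missing log entries and the crashes described in the question.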

Q3: I don't know what causes the error message; any ideas here?

The error message was a side-effect of the previous issue. The fact that the system call table was being indexed incorrectly (see Q2) caused other locations in memory to get modified.

Q4: ...the comments above on why __NR_open is not available by default?

This was a misreport by the IDE, which I have stopped using. The __NR_open macro was already defined; the fix for Q2 made that even more obvious.

Q5: ...whether it's a good idea to do what I'm doing with the #define?

Short answer: No, not a good idea and definitely not needed. See Q2 above.

Q6: ...what I should be using instead if I really shouldn't be trying to use __NR_open?

Based on the answers to the previous questions, this is not a problem. Using __NR_open is just fine and expected. This part had gotten messed up due to the header files in Q2.

Q7: Any additional suggestions on why it would suddenly crash and why the arguments seem to be garbage-like?

The crashes when using __NR_openat were likely caused by the macro being expanded to an incorrect value (see Q2 again). However, I can say that I had no real need to use it. I was supposed to be using __NR_open as specified above, but was trying out __NR_openat as a workaround for the issue fixed in Q2.

In short, the answer to Q2 helped fix several issues in a cascading effect.

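Notice: Technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please credit this site's URL or the original source. For any questions, contact yoyou2525@163.com.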
