
Linux kernel filp_open fails with ENOENT

UPDATE: Following up on @kisch's great answer, I read about softirq context, and it seems that (for a very good reason) it is impossible to access user space from within this context. I assume that this is indeed the reason why it failed.

I am currently working on a kernel module that deals with user-space files. I know this is considered bad practice, but I still need to do it.

The module installs a netfilter hook to catch every outgoing packet in the system, and each time the hook is called, it calls filp_open.

This is where the voodoo starts. When I send a ping over the loopback, everything works fine and the file (/etc/fstab in this case) is opened successfully. When I ping the machine from a different IP in my house, filp_open fails with ENOENT.

To figure out where it actually fails, I ran the module under QEMU and successfully reproduced the weird behavior. Apparently, it fails in the kernel's internal function do_last, in the following code (taken from fs/namei.c):

    if (unlikely(d_is_negative(path.dentry))) {
        path_to_nameidata(&path, nd);
        return -ENOENT;
    }

I have absolutely no clue what makes it fail, since the file existed the whole time.

Does anyone have any idea?

This is the part of the code where it fails:

unsigned int nf_sendfile_hook(void *priv,
                              struct sk_buff *skb,
                              const struct nf_hook_state *state)
{
    struct file *filp;

    if (NULL == g_get_payload_func) {
        // as long as we don't have a way to get our payloads, we don't
        // have much to do.
        return NF_ACCEPT;
    }

    // This is the call that fails with ENOENT for pings from another host.
    filp = filp_open("/etc/fstab", O_RDONLY, 0);
    if (IS_ERR(filp)) {
        // filp is an ERR_PTR here; log the decoded errno, not the raw pointer.
        printk(KERN_ERR "filp_open failed: %ld\n", PTR_ERR(filp));
        return NF_ACCEPT;
    }

    ...
}

Thanks in advance.

I'd need some more details to be sure about it. Where exactly do you hook in?

Very likely, the different behaviour is caused by the different contexts in which the kernel calls your hook function.

When you "send a ping from the loopback", you have a userspace process issuing a sendmsg() syscall. The kernel starts a callchain in a user context attached to that process. The netfilter hook is likely called directly in that callchain, before the packet is put into a queue for further, detached processing.

When you "ping the machine from a different IP in my house", you have a callchain starting in the Soft-IRQ context of NET_RX_SOFTIRQ, beginning in net_rx_action(). That callchain classifies the incoming packet as ICMP and passes it to the internal ICMP receive routine, which directly sends the ping reply packet, which in turn likely calls the netfilter hook.

The Soft-IRQ context has no relation to any userspace process.
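One way to confirm which context the hook runs in is to log it from inside the hook. The helper below is a minimal sketch, not code from the question: the name log_hook_context is hypothetical, and it assumes a kernel recent enough to provide in_task(). Note that in_softirq() also reports true when bottom halves are merely disabled, so treat the output as a hint rather than proof.

```c
#include <linux/kernel.h>
#include <linux/preempt.h>
#include <linux/printk.h>
#include <linux/sched.h>

/* Hypothetical helper: call it at the top of the netfilter hook. */
static void log_hook_context(void)
{
	/*
	 * In softirq context, current is just whichever task happened to be
	 * interrupted, so comm/pid are unrelated to the packet being handled.
	 */
	pr_info_ratelimited("hook ctx: in_softirq=%d in_task=%d current=%s[%d]\n",
			    !!in_softirq(), !!in_task(),
			    current->comm, current->pid);
}
```

Comparing the log lines for a loopback ping and for a ping from another host should show whether the two cases really arrive in different contexts.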

Now, depending on your kernel setup, it's entirely possible that the filesystem lookup code depends on information present in a user context to decide about access restrictions. You might have mount namespaces, so without a user context to identify the namespace, the /etc filesystem might not even be mounted from the lookup's point of view, which would explain the ENOENT.

It's also quite possible that the filesystem lookup code would need to call some operation which needs to schedule(), i.e., block until a time-consuming operation completes (like paging in blocks of the filesystem's underlying device for the lookup). This wouldn't work from the Soft-IRQ context.
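If the file access cannot be avoided, a common way around both problems is to keep the hook non-blocking and hand the work to a workqueue, which runs in a kernel thread where sleeping is allowed. The module below is a minimal sketch under assumptions: the names open_file_work_fn, deferred_hook and deferred_ops are hypothetical, the hook point (NF_INET_LOCAL_OUT, IPv4) is a guess at what the question uses, and it assumes a kernel that provides nf_register_net_hook(). It has not been built against any particular kernel version.

```c
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/err.h>
#include <linux/fs.h>
#include <linux/netfilter.h>
#include <linux/netfilter_ipv4.h>
#include <linux/workqueue.h>
#include <net/net_namespace.h>

/* Runs later in a kworker thread (process context), where blocking is fine. */
static void open_file_work_fn(struct work_struct *work)
{
	struct file *filp = filp_open("/etc/fstab", O_RDONLY, 0);

	if (IS_ERR(filp)) {
		pr_err("deferred filp_open failed: %ld\n", PTR_ERR(filp));
		return;
	}
	pr_info("deferred filp_open succeeded\n");
	filp_close(filp, NULL);
}

static DECLARE_WORK(open_file_work, open_file_work_fn);

/* May run in softirq context: never block here, only queue the work. */
static unsigned int deferred_hook(void *priv, struct sk_buff *skb,
				  const struct nf_hook_state *state)
{
	schedule_work(&open_file_work); /* does nothing if already pending */
	return NF_ACCEPT;
}

static struct nf_hook_ops deferred_ops = {
	.hook     = deferred_hook,
	.pf       = NFPROTO_IPV4,
	.hooknum  = NF_INET_LOCAL_OUT,
	.priority = NF_IP_PRI_FIRST,
};

static int __init deferred_init(void)
{
	return nf_register_net_hook(&init_net, &deferred_ops);
}

static void __exit deferred_exit(void)
{
	nf_unregister_net_hook(&init_net, &deferred_ops);
	cancel_work_sync(&open_file_work); /* wait for queued work to finish */
}

module_init(deferred_init);
module_exit(deferred_exit);
MODULE_LICENSE("GPL");
```

The trade-off is that the work item runs after the hook has returned, so it cannot influence the verdict for that packet, and anything it needs from the skb has to be copied out inside the hook before the work is queued.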

This is not a complete answer yet (too many "likely"s), but I'm pretty sure it's the right direction in which to look.
