
filp_open crashes in release function of char device driver

I'm writing a device driver that has to do several things when a file is closed, and one of them is opening another file. Everything works perfectly if the user program calls open, then close, then returns. But if the program only calls open and then returns, my driver crashes, and I've noticed that it crashes exactly when it reaches the filp_open in my release function. My impression is that when the release function is triggered not by the user (via close) but by the kernel (because the program returned without calling close), I can't do a filp_open. (The path is certainly correct, because the same call works when the program does open, close and return in that order.)

This is the code that causes the crash in my release method:

struct file *file_open(const char *path, int flags, int rights)
{
    if(path==NULL){
        print_message("path is NULL\n");
        return NULL;
    }
    struct file *filp = NULL;
    mm_segment_t oldfs;
    int err = 0;

    oldfs = get_fs();
    set_fs(get_ds());
    print_message("I'm doing filp_open");

    filp = filp_open(path, flags, rights);
    set_fs(oldfs);
    if (IS_ERR(filp)) {
        print_message("error during filp_open\n");
        err = PTR_ERR(filp);
        return NULL;
    }
    if(filp==NULL){
        print_message("filp is NULL\n");
        return NULL;
    }
    return filp;
}

And this is the kernel dump I get from dmesg:

[  961.870540] I'm doing filp_open
[  961.870548] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[  961.870550] PGD 0 P4D 0
[  961.870554] Oops: 0000 [#1] SMP PTI
[  961.870556] CPU: 1 PID: 2315 Comm: userspace Tainted: G           OE     4.18.0-25-generic #26~18.04.1-Ubuntu
[  961.870558] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  961.870570] RIP: 0010:set_root+0x26/0xc0
[  961.870571] Code: 1f 44 00 00 0f 1f 44 00 00 55 65 48 8b 04 25 00 5c 01 00 48 89 e5 41 55 41 54 41 52 53 f6 47 38 40 4c 8b a0 88 0a 00 00 74 3d <41> 8b 4c 24 08 f6 c1 01 75 7c 49 8b 54 24 20 49 8b 44 24 18 48 89
[  961.870596] RSP: 0018:ffffbd9bc33dbaa8 EFLAGS: 00010202
[  961.870598] RAX: ffff9a8f72384500 RBX: ffffbd9bc33dbbf0 RCX: 0000000000000001
[  961.870599] RDX: ffffffff8fef34c8 RSI: 0000000000000041 RDI: ffffbd9bc33dbbf0
[  961.870600] RBP: ffffbd9bc33dbac8 R08: ffff9a8fbfd27080 R09: ffff9a8fb474c600
[  961.870602] R10: ffffbd9bc33dba98 R11: 00000000ffffffff R12: 0000000000000000
[  961.870603] R13: ffffbd9bc33dbbf0 R14: 0000000000000001 R15: 0000000000000002
[  961.870605] FS:  0000000000000000(0000) GS:ffff9a8fbfd00000(0000) knlGS:0000000000000000
[  961.870606] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  961.870607] CR2: 0000000000000008 CR3: 0000000050e0a002 CR4: 00000000000606e0
[  961.870610] Call Trace:
[  961.870615]  path_init+0x16f/0x2f0
[  961.870617]  path_openat+0x78/0x1780
[  961.870621]  ? sched_clock+0x9/0x10
[  961.870626]  ? sched_clock_cpu+0x11/0xb0
[  961.870628]  do_filp_open+0x9b/0x110
[  961.870633]  ? vprintk_emit+0xec/0x290
[  961.870636]  file_open_name+0x114/0x180
[  961.870638]  ? file_open_name+0x114/0x180
[  961.870640]  filp_open+0x33/0x60
[  961.870643]  file_open+0x56/0x90 [driver]
[  961.870645]  my_char_device_driver_close+0x96/0x190 [driver]
[  961.870647]  __fput+0xea/0x220
[  961.870649]  ____fput+0xe/0x10
[  961.870652]  task_work_run+0x9d/0xc0
[  961.870655]  do_exit+0x2eb/0xb30
[  961.870658]  ? __do_page_fault+0x270/0x4d0
[  961.870660]  do_group_exit+0x43/0xb0
[  961.870662]  __x64_sys_exit_group+0x18/0x20
[  961.870666]  do_syscall_64+0x5a/0x120
[  961.870672]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  961.870674] RIP: 0033:0x7f0304ee3e06
[  961.870675] Code: Bad RIP value.
[  961.870679] RSP: 002b:00007fff99afb5c8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[  961.870680] RAX: ffffffffffffffda RBX: 00007f03051e6740 RCX: 00007f0304ee3e06
[  961.870682] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
[  961.870686] RBP: 0000000000000000 R08: 00000000000000e7 R09: ffffffffffffff80
[  961.870688] R10: 0000000000000002 R11: 0000000000000246 R12: 00007f03051e6740
[  961.870689] R13: 0000000000000001 R14: 00007f03051ef628 R15: 0000000000000000
[  961.870691] Modules linked in: driver(OE) vboxvideo(OE) vboxsf(OE) snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm crct10dif_pclmul crc32_pclmul snd_seq_midi snd_seq_midi_event ghash_clmulni_intel joydev pcbc snd_rawmidi aesni_intel aes_x86_64 crypto_simd cryptd glue_helper snd_seq intel_rapl_perf input_leds snd_seq_device snd_timer serio_raw snd soundcore mac_hid vboxguest(OE) sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid psmouse video vmwgfx ttm ahci drm_kms_helper libahci syscopyarea e1000 sysfillrect i2c_piix4 sysimgblt fb_sys_fops pata_acpi drm
[  961.870724] CR2: 0000000000000008
[  961.870726] ---[ end trace 3fb99e3beca99ccc ]---
[  961.870728] RIP: 0010:set_root+0x26/0xc0
[  961.870729] Code: 1f 44 00 00 0f 1f 44 00 00 55 65 48 8b 04 25 00 5c 01 00 48 89 e5 41 55 41 54 41 52 53 f6 47 38 40 4c 8b a0 88 0a 00 00 74 3d <41> 8b 4c 24 08 f6 c1 01 75 7c 49 8b 54 24 20 49 8b 44 24 18 48 89
[  961.870753] RSP: 0018:ffffbd9bc33dbaa8 EFLAGS: 00010202
[  961.870755] RAX: ffff9a8f72384500 RBX: ffffbd9bc33dbbf0 RCX: 0000000000000001
[  961.870756] RDX: ffffffff8fef34c8 RSI: 0000000000000041 RDI: ffffbd9bc33dbbf0
[  961.870757] RBP: ffffbd9bc33dbac8 R08: ffff9a8fbfd27080 R09: ffff9a8fb474c600
[  961.870759] R10: ffffbd9bc33dba98 R11: 00000000ffffffff R12: 0000000000000000
[  961.870760] R13: ffffbd9bc33dbbf0 R14: 0000000000000001 R15: 0000000000000002
[  961.870761] FS:  0000000000000000(0000) GS:ffff9a8fbfd00000(0000) knlGS:0000000000000000
[  961.870763] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  961.870764] CR2: 00007f0304ee3ddc CR3: 0000000050e0a002 CR4: 00000000000606e0
[  961.870765] Fixing recursive fault but reboot is needed!

I think you are correct that the crash is a result of filp_open() being called from the file "release" handler during process exit.

The direct cause of the crash appears to be that current->fs is NULL when set_root() is called during the call to filp_open(). That is consistent with the oops, which faults inside set_root() at address 0000000000000008, a small offset from a NULL pointer.

The do_exit() function calls exit_files() to close the open files, but the files' "release" handlers are not called immediately. Instead, work items are queued on the current task so that the "release" handlers are called later.

The do_exit() function then calls exit_fs(), which destroys current->fs and sets current->fs to NULL.

A bit further on, do_exit() calls exit_task_work(), which runs the previously queued work items (and prevents any more from being added). This is the point at which the files' "release" handlers are actually called, as the __fput and task_work_run frames under do_exit in the call trace show.

The upshot is that current->fs is valid in the "release" handler when a file is closed normally, but it is NULL in the "release" handler when the file is closed as part of task exit. filp_open() crashes when current->fs is NULL, so you should avoid calling it from a "release" handler, or at least check that current->fs is non-NULL before you call filp_open() from a "release" handler.
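To make the ordering concrete, here is a small user-space model of it in plain C. It is only a sketch, not kernel code: every name in it (struct task_model, struct fs_model, release_model, close_model, do_exit_model and the two helper functions) is invented for illustration, and the normal close path is simplified to a direct call. It only shows why the same handler sees a valid fs pointer in one case and NULL in the other.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for kernel state; these are not the real kernel types. */
struct fs_model { int root; };

struct task_model {
    struct fs_model *fs;                   /* models current->fs */
    void (*pending)(struct task_model *);  /* models the queued task work */
    int release_saw_fs;                    /* -1 = release has not run yet */
};

/* Models the driver's "release" handler: records whether fs was usable. */
static void release_model(struct task_model *t)
{
    t->release_saw_fs = (t->fs != NULL);
}

/* Normal close(): simplified here to run release while fs is still set. */
static void close_model(struct task_model *t)
{
    release_model(t);
}

/* Exit path, in the order described above. */
static void do_exit_model(struct task_model *t)
{
    t->pending = release_model;   /* exit_files(): release is only queued */
    t->fs = NULL;                 /* exit_fs(): current->fs is torn down */
    if (t->pending) {             /* exit_task_work(): queued work runs now */
        void (*fn)(struct task_model *) = t->pending;
        t->pending = NULL;
        fn(t);
    }
    assert(t->pending == NULL);   /* nothing is left queued after exit */
}

/* Returns 1 if release saw a valid fs on an explicit close. */
int release_sees_fs_on_close(void)
{
    struct fs_model fs = { 0 };
    struct task_model t = { &fs, NULL, -1 };
    close_model(&t);
    return t.release_saw_fs;
}

/* Returns 1 if release saw a valid fs when the task exits without close. */
int release_sees_fs_on_exit(void)
{
    struct fs_model fs = { 0 };
    struct task_model t = { &fs, NULL, -1 };
    do_exit_model(&t);
    return t.release_saw_fs;
}
```

In this model release_sees_fs_on_close() returns 1 and release_sees_fs_on_exit() returns 0, which mirrors the two behaviours reported in the question. In the real driver the equivalent guard would be a test of current->fs at the top of the release handler.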

A possible work-around is to call filp_open() from a kernel thread, or from a work item queued on the system workqueue with schedule_work().
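A rough sketch of the schedule_work() variant might look like the following. It is kernel-module code, so it only builds against kernel headers, and it has not been tested against the asker's driver. The names struct deferred_open, deferred_open_fn and queue_deferred_open are hypothetical, and the open flags and mode are placeholders for whatever the driver really needs.

```c
#include <linux/err.h>
#include <linux/fs.h>
#include <linux/printk.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/workqueue.h>

/* Hypothetical job object: carries the path to open later. */
struct deferred_open {
	struct work_struct work;
	char *path;
};

/* Runs in a kworker thread, which still has a valid current->fs. */
static void deferred_open_fn(struct work_struct *work)
{
	struct deferred_open *job =
		container_of(work, struct deferred_open, work);
	/* Placeholder flags and mode; use the driver's real ones. */
	struct file *filp = filp_open(job->path, O_WRONLY | O_CREAT, 0644);

	if (IS_ERR(filp)) {
		pr_err("deferred filp_open failed: %ld\n", PTR_ERR(filp));
	} else {
		/* ... do the actual work on filp here ... */
		filp_close(filp, NULL);
	}
	kfree(job->path);
	kfree(job);
}

/* Call this from the release handler instead of calling filp_open(). */
static int queue_deferred_open(const char *path)
{
	struct deferred_open *job = kmalloc(sizeof(*job), GFP_KERNEL);

	if (!job)
		return -ENOMEM;
	/* Copy the path: the caller's buffer may be gone when the work runs. */
	job->path = kstrdup(path, GFP_KERNEL);
	if (!job->path) {
		kfree(job);
		return -ENOMEM;
	}
	INIT_WORK(&job->work, deferred_open_fn);
	schedule_work(&job->work);
	return 0;
}
```

Two caveats apply to this approach. The open now happens asynchronously, so the release handler cannot report its result to anyone, and a relative path would be resolved against the worker thread's working directory rather than the exiting process's, so an absolute path is the safer choice. The module's exit function would also need to wait for outstanding work items to finish before the module text is unloaded.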
