From userspace, how can I tell if Linux's soft watchdog is configured with no way out?

I am writing a system monitor for Linux and want to include some watchdog functionality. In the kernel, you can configure the watchdog to keep going even if /dev/watchdog is closed. In other words, if my daemon exits normally and closes /dev/watchdog, the system would still reboot 59 seconds later. That may or may not be desirable behavior for the user.

I need to make my daemon aware of this setting because it will influence how I handle SIGINT. If the setting is on, my daemon would need to (preferably) start an orderly shutdown on exit or (at least) warn the user that the system is going to reboot shortly.

Does anyone know of a method to obtain this setting from user space? I don't see anything in sysconf() to get the value. Likewise, I need to be able to tell if the software watchdog is enabled to begin with.

Edit:

Linux provides a very simple watchdog interface. A process can open /dev/watchdog. Once the device is opened, the kernel begins a 60-second countdown to reboot unless some data is written to that file, in which case the clock resets.
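The "write some data to reset the clock" step can be sketched in C as below. This is only a minimal sketch: the helper name `watchdog_ping` is hypothetical, and it takes an already-open descriptor so that the caller decides when /dev/watchdog is opened and the countdown starts.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: reset the countdown by writing one byte to an
 * already-open watchdog descriptor. Any data counts as a keepalive; a NUL
 * byte is used here to stay clear of 'V', which some drivers treat
 * specially. Returns 0 on success, -1 on error. */
int watchdog_ping(int fd)
{
    ssize_t n;

    /* Retry if the write was interrupted by a signal. */
    do {
        n = write(fd, "\0", 1);
    } while (n == -1 && errno == EINTR);

    return n == 1 ? 0 : -1;
}
```

A daemon would open /dev/watchdog once with O_WRONLY and then call `watchdog_ping(fd)` from its main loop at an interval comfortably shorter than the timeout.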

Depending on how the kernel is configured, closing that file may or may not stop the countdown. From the documentation:

The watchdog can be stopped without causing a reboot if the device /dev/watchdog is closed correctly, unless your kernel is compiled with the CONFIG_WATCHDOG_NOWAYOUT option enabled.

I need to be able to tell, from within a user space daemon, whether CONFIG_WATCHDOG_NOWAYOUT was set, so that I can handle the shutdown of said daemon differently. In other words, if that setting is enabled, a simple:

# /etc/init.d/mydaemon stop

... would reboot the system in 59 seconds, because nothing is writing to /dev/watchdog any longer. So, if it's set, my handler for SIGINT needs to do additional things (i.e. warn the user at the least).
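The shutdown path described above could be structured roughly as follows. This is a sketch under assumptions, not the daemon's actual code: the names `stop_requested`, `install_sigint_handler` and `warn_if_reboot_pending` are hypothetical, and the `nowayout` argument stands for whatever the daemon has managed to conclude about the watchdog.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction() under strict C modes */
#include <signal.h>
#include <stdio.h>

/* Set from the handler, polled by the daemon's main loop. */
static volatile sig_atomic_t stop_requested = 0;

static void on_sigint(int signo)
{
    (void)signo;
    stop_requested = 1;  /* only async-signal-safe work belongs here */
}

/* Hypothetical helper: install the SIGINT handler. Returns 0 on success. */
int install_sigint_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGINT, &sa, NULL);
}

/* Hypothetical helper, called from the main loop once stop_requested is
 * set. Returns 1 if a warning was printed, 0 otherwise. */
int warn_if_reboot_pending(int nowayout, FILE *out)
{
    if (!nowayout)
        return 0;
    fprintf(out, "warning: the watchdog cannot be stopped; "
                 "the system will reboot when the timer expires\n");
    return 1;
}
```

Keeping the handler down to setting a flag leaves the warning (or an orderly shutdown) to ordinary code in the main loop, where calling stdio functions is safe.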

I cannot find a way of obtaining this setting from user space :( Any help is appreciated.

AHA! After digging through the kernel's linux/watchdog.h and drivers/watchdog/softdog.c, I was able to determine the capabilities of the softdog ioctl() interface. Looking at the capabilities that it announces in struct watchdog_info:

static struct watchdog_info ident = {
        .options          = WDIOF_SETTIMEOUT |
                            WDIOF_KEEPALIVEPING |
                            WDIOF_MAGICCLOSE,
        .firmware_version = 0,
        .identity         = "Software Watchdog",
};

It does support a magic close that (seems to) override CONFIG_WATCHDOG_NOWAYOUT. So, when terminating normally, I have to write a single char 'V' to /dev/watchdog then close it, and the timer will stop counting.
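The write-'V'-then-close sequence might look like this in C. It is a minimal sketch; the helper name `watchdog_magic_close` is hypothetical, and the descriptor is assumed to be the one the daemon already holds on /dev/watchdog.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write the magic character 'V' and then close the
 * descriptor. Returns 0 if both steps succeeded, -1 otherwise. The
 * descriptor is closed either way, so on failure the timer may still be
 * running. */
int watchdog_magic_close(int fd)
{
    ssize_t n;
    int rc;

    /* Retry if the write was interrupted by a signal. */
    do {
        n = write(fd, "V", 1);
    } while (n == -1 && errno == EINTR);

    rc = close(fd);
    return (n == 1 && rc == 0) ? 0 : -1;
}
```

A return value of 0 only means the character was written and the descriptor closed. The kernel's watchdog API documentation describes CONFIG_WATCHDOG_NOWAYOUT as leaving no way of disabling the watchdog once it has started, so it is probably safer to treat the magic close as a request that may be ignored than as a guaranteed stop.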

A simple ioctl() on a file descriptor for /dev/watchdog, asking for WDIOC_GETSUPPORT, allows one to determine whether this flag is set. In outline:

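#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/watchdog.h>

int main(void)
{
    struct watchdog_info info;
    int fd;

    /* A successful open starts the countdown. */
    fd = open("/dev/watchdog", O_WRONLY);
    if (fd == -1) {
        perror("open");
        return 1;   /* timer did not start - no additional concerns */
    }

    if (ioctl(fd, WDIOC_GETSUPPORT, &info) == -1) {
        perror("ioctl");
        return 1;   /* but you probably started the timer! See below. */
    }

    if (info.options & WDIOF_MAGICCLOSE) {
        printf("Watchdog supports magic close char\n");
        /* You have started the timer here! Handle that appropriately. */
    }

    /* fd is still open: keep writing to it, or magic close it before exiting. */
    return 0;
}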

When working with hardware watchdogs, you might want to open with O_NONBLOCK so that ioctl(), not open(), blocks (hence detecting a busy card).

If WDIOF_MAGICCLOSE is not supported, one should just assume that the soft watchdog is configured with NOWAYOUT. Remember, just opening the device successfully starts the countdown. If all you're doing is probing to see if it supports magic close and it does, then magic close it. Otherwise, be sure to deal with the fact that you now have a running watchdog.

Unfortunately, there's no real way to know for sure without actually starting it, at least not that I could find.
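One possible exception on newer kernels, offered as an assumption rather than as part of the original answer: when the kernel is built with CONFIG_WATCHDOG_SYSFS, the watchdog core exposes per-device attributes under /sys/class/watchdog/, including a nowayout file that reads as 1 or 0. The sketch below reads such a file without opening /dev/watchdog. The helper name `read_nowayout_attr` is hypothetical, and the attribute may simply be absent on older kernels or on drivers that do not use the watchdog core.

```c
#include <stdio.h>

/* Hypothetical helper: read a sysfs-style boolean attribute.
 * Returns 1 or 0 for the value read, or -1 if the file is missing,
 * unreadable, or does not start with '0' or '1'. */
int read_nowayout_attr(const char *path)
{
    FILE *f = fopen(path, "r");
    int c;

    if (f == NULL)
        return -1;
    c = fgetc(f);
    fclose(f);

    if (c == '1')
        return 1;
    if (c == '0')
        return 0;
    return -1;
}
```

A call such as `read_nowayout_attr("/sys/class/watchdog/watchdog0/nowayout")` (the `watchdog0` device name is an assumption) would return -1 where the attribute does not exist, in which case the ioctl() probe is still the fallback.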

A watchdog guards against hard-locking the system, either because of a software crash or a hardware failure.

What you need is a daemon monitoring daemon (dmd). Check 'monit'.

I think the watchdog device drivers are really intended for use on embedded platforms (or at least well controlled ones) where the developers will have control of which kernel is in use.

This could be considered to be an oversight, but I think it is not.

One other thing you could try: if the watchdog was built as a loadable module, unloading it will presumably abort the shutdown?
