
What Can Cause a C Program to Crash Operating System

I recently found that a fairly large image manipulation program I'm writing in C on a Windows 8 machine has a bug when used in very particular circumstances. Unfortunately, the bug is causing my entire computer to come to a standstill so that my only option is to pull the plug on the computer (especially annoying when I'm working remotely...)

Because it's an image manipulation program, I can't just flood it with print statements to isolate the problematic section - the problem occurs somewhere in a loop that's called billions of times, so adding a printf slows it down to the point that it would take days to get to a failing iteration.

I understand, therefore, if this question is too broad, as it isn't really reasonable for me to put down all of the code that could cause my problem. I'm simply asking:

What are the circumstances in which C code can, instead of seg faulting or halting the program, actually freeze the entire OS?

When I search the problem, I see code golf questions like this:

A C program which crashes the system (shuts down the system)

This is not what I'm asking - obviously I haven't written system("shutdown") anywhere in my loop.

I'm most familiar with Python and Java, so this problem is not what I'm used to, but in my experience:

  • Dividing by zero produces a seg fault
  • Accessing memory by accident that is slightly outside an intended array causes a seg fault (sometimes down the road a little)
  • Accessing protected memory causes the program to hang
  • Stack overflow causes a seg fault
  • Dereferencing a non-initialized pointer causes a seg fault

Is this impression false - could those cases cause the whole system to crash? What cases am I missing? Is it dependent on my version of gcc, or my permission status?

I haven't been able to try to reproduce it on a different operating system yet, as it requires a few dependencies to run the entire program.

If my only option is to sit for days waiting for the program to run with print statements, or avoid weird situations, then, of course, so be it. I'm looking for key places to look for the bug.
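Printing on every iteration is not the only way to find the failing region. A minimal sketch of throttled checkpoint logging follows; it assumes the loop has an iteration counter, and the helper name `checkpoint` and its parameters are hypothetical, not part of the program described above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: record progress only every `interval` iterations.
   Returns 1 if a checkpoint was written, 0 if this iteration was skipped,
   and -1 on an I/O error. */
int checkpoint(FILE *log, unsigned long long iter, unsigned long long interval)
{
    assert(log != NULL && interval > 0);

    /* Cheap path: one modulo and one branch for most iterations. */
    if (iter % interval != 0)
        return 0;

    if (fprintf(log, "reached iteration %llu\n", iter) < 0)
        return -1;

    /* fflush hands the data to the OS; it does not by itself guarantee
       that the data has reached the disk. */
    return fflush(log) == 0 ? 1 : -1;
}
```

Calling something like `checkpoint(log, i, 1ULL << 20)` inside the hot loop writes roughly one line per million iterations instead of one per iteration. After a forced reboot, the last line in the log bounds the failing iteration to within one interval, and a second run with a smaller interval can narrow it further. On a hard freeze, data still held in the OS cache can be lost, so a platform-specific sync call (`_commit` on Windows, `fsync` on POSIX) after `fflush` may be needed; it is left out here to keep the sketch portable.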

On modern systems with hardware-enforced privilege separation between user mode and kernel mode, and an operating system that correctly configures these mechanisms, you simply cannot crash the system from a user-mode process.

Any of those errors is trapped by the CPU, which calls exception handlers in the OS, and those handlers will quickly pull the plug on your program.
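To see that such a fault is delivered to the offending process rather than to the whole system, here is a minimal sketch using only the standard C `<signal.h>` interface. The names `on_fault`, `last_signal`, and `simulate_fault` are hypothetical. The fault is simulated with `raise()`, because returning from a handler after a real hardware fault is undefined behavior; how real faults are reported is platform-specific.

```c
#include <assert.h>
#include <signal.h>

/* Written by the handler. volatile sig_atomic_t is the type the C standard
   guarantees can be safely assigned from a signal handler. */
volatile sig_atomic_t last_signal = 0;

/* Hypothetical handler: just remember which signal arrived. */
void on_fault(int sig)
{
    last_signal = sig;
}

/* Installs the handler, simulates a fault with raise(), restores the
   default action, and returns the signal number the handler observed
   (or -1 if the handler could not be installed). */
int simulate_fault(int sig)
{
    /* This sketch is only meant for fault-type signals. */
    assert(sig == SIGSEGV || sig == SIGFPE || sig == SIGILL);

    last_signal = 0;
    if (signal(sig, on_fault) == SIG_ERR)
        return -1;

    raise(sig);              /* handler runs before raise() returns */
    signal(sig, SIG_DFL);    /* put the default action back */
    return (int)last_signal;
}
```

Without a handler installed, the default action for these signals is to terminate the process, which is the "pull the plug" behavior described above: the program dies, while the OS and other processes keep running.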

If I had to guess, a piece of hardware is overheating or malfunctioning:

  • Overheating CPU due to poor thermal conductivity with heatsink
  • Failing / under-sized power supply
  • Failing DIMMs
  • Failing hard drive
  • Failing CPU
  • Failing / overheating GPU

I've seen cryptocoin-mining software bring a system to its knees because it was pushing the limits of the GPU. When the card would lock up or reset, the driver would get confused or lock up too, and the system would end up needing a reboot.

Your system is doing next to nothing when you're just sitting there browsing the web, etc. But when you start running a CPU-intensive application, it can bring out problems that you didn't know were there.

While this is a little out-of-place on Stack Overflow, it falls into one of those grey areas between hardware and software. I would stress-test your system, keeping an eye on CPU/GPU/memory temperatures and power supply voltages. Check out MemTest86 and Stresslinux.

The most trivial cause of an OS freeze is "memory full". If you have processes that use a lot of memory, your system is going to swap from main memory (typically RAM) to secondary memory (typically disk), which leads to a huge overhead... As a user, what you usually observe is an almost frozen computer, sometimes so frozen that you think it has crashed. If your OS is badly designed, then it sometimes does crash!
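One way to check whether runaway memory use is the culprit is to route allocations through a wrapper that refuses to exceed a fixed budget, so the program fails visibly before the machine starts swapping. This is a minimal sketch under that assumption; `MemBudget`, `budget_alloc`, and `budget_free` are hypothetical names, and the caller is assumed to know the size of each block it frees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tracker for how many bytes the program currently holds. */
typedef struct {
    size_t limit;  /* maximum bytes allowed at any one time */
    size_t used;   /* bytes currently allocated through this tracker */
} MemBudget;

/* Allocates `size` bytes, or returns NULL if that would exceed the budget
   or if malloc itself fails. */
void *budget_alloc(MemBudget *b, size_t size)
{
    assert(b != NULL && b->used <= b->limit);

    /* Written as a subtraction so the comparison cannot overflow. */
    if (size > b->limit - b->used)
        return NULL;

    void *p = malloc(size);
    if (p != NULL)
        b->used += size;
    return p;
}

/* Frees a block obtained from budget_alloc; `size` must match the
   size that was requested for it. */
void budget_free(MemBudget *b, void *p, size_t size)
{
    assert(b != NULL);
    if (p == NULL)
        return;
    assert(size <= b->used);
    free(p);
    b->used -= size;
}
```

Setting `limit` somewhat below the machine's physical RAM and logging every `NULL` return from `budget_alloc` would show whether the freeze coincides with memory exhaustion rather than with a fault in the loop itself.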


 