Is this a Linux Kernel Crash? How do I resolve it?

We are testing a firewall application running on embedded Linux. At a certain point during testing, the system hangs (freezes) and we see the following on the console:

TCHDOG: eth0 (fsl-gianfar): transmit queue 0 timed out
------------[ cut here ]------------ WARNING: at net/sched/sch_generic.c:279 Modules linked in: CPU: 0 PID: 0 Comm:
swapper/0 Not tainted 3.12.19-rt30-gc29fe1a #27 task: c08f9300 ti:
effea000 task.ti: c093a000 NIP: c052a98c LR: c052a98c CTR: c0327948
REGS: effebe60 TRAP: 0700   Not tainted  (3.12.19-rt30-gc29fe1a) MSR:
00029000 <CE,EE,ME>  CR: 44044022  XER: 20000000

GPR00: c052a98c effebf10 c08f9300 0000003f c128c484 c128c9d0 c0328b54
00021000  GPR08: 00000001 00000001 0099b000 00000312 24044024 0f003103
effea000 c07f6f28  GPR16: 00000100 00200000 c0940000 c08f0000 001631a5
00000000 000000a4 ffffffff  GPR24: 00000000 00000000 effea000 00000004
c0940000 c0940000 c74d0000 00000000  NIP [c052a98c]
dev_watchdog+0x2dc/0x2ec LR [c052a98c] dev_watchdog+0x2dc/0x2ec Call
Trace: [effebf10] [c052a98c] dev_watchdog+0x2dc/0x2ec (unreliable)
[effebf40] [c005194c] call_timer_fn.isra.29+0x28/0x84 [effebf60]
[c0051b28] run_timer_softirq+0x180/0x1fc [effebfa0] [c004a5e8]
__do_softirq+0x100/0x1cc [effebff0] [c000d6e8] call_do_softirq+0x24/0x3c [c093be60] [c0004920] do_softirq+0x90/0xb8
[c093be80] [c004afb4] irq_exit+0xa4/0xc8 [c093be90] [c0009c10]
timer_interrupt+0x1a4/0x1d0 [c093bec0] [c000f594]
ret_from_except+0x0/0x18
--- Exception: 901 at arch_cpu_idle+0x24/0x5c
    LR = arch_cpu_idle+0x24/0x5c [c093bf80] [c00ac4ec] rcu_idle_enter+0xac/0xec (unreliable) [c093bf90] [c0086b00]
cpu_startup_entry+0x120/0x170 [c093bfc0] [c08a97a8]
start_kernel+0x2f0/0x304 [c093bff0] [c00003fc] skpinv+0x2e8/0x324
Instruction dump: 4e800421 80fe0204 4bffff44 7fc3f378 4bfe72e5
7fc4f378 7c651b78 3c60c085  7fe6fb78 38632bf0 4cc63182 48184835
<0fe00000> 39200001 993c9c37 4bffffb4 
---[ end trace d3f58d6e7db83823 ]---

Is it a kernel crash? What caused it? How do I resolve it? Please let me know if you need any other information.

No, it isn't a kernel crash.

It's a warning from an internal watchdog timer that watches over the transmit work of the Freescale Gianfar Ethernet driver.

The message means the driver queued one or more frames for transmission and then timed out waiting for a transmit-confirmation interrupt (or other indication) from the Gianfar hardware that they had been transmitted.
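To make the timeout condition concrete, here is a minimal sketch of the kind of check the watchdog performs. It is a simplified model, not kernel code: the names TxQueue and watchdog_expired are hypothetical, and time is counted in abstract ticks. The real check lives in dev_watchdog(), which is the function named in the trace above.

```python
from dataclasses import dataclass


@dataclass
class TxQueue:
    """Hypothetical stand-in for one transmit queue of a network device."""
    stopped: bool      # the driver has stopped the queue (e.g. its ring is full)
    trans_start: int   # tick at which the last transmission was started


def watchdog_expired(queue: TxQueue, now: int, timeout: int) -> bool:
    """Return True if the queue is stopped and the last transmission
    started more than `timeout` ticks ago."""
    return queue.stopped and (now - queue.trans_start) > timeout


if __name__ == "__main__":
    stuck = TxQueue(stopped=True, trans_start=100)
    # 600 ticks have passed with the queue still stopped: timed out.
    print(watchdog_expired(stuck, now=700, timeout=500))
    # A queue that is not stopped never triggers the warning.
    print(watchdog_expired(TxQueue(False, 100), now=700, timeout=500))
```

The point of the model is that the warning needs both conditions at once: frames are pending (the queue is stopped) and nothing has completed for longer than the timeout.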

This may be a driver issue, but it can very well be a hardware issue (e.g. the Ethernet MAC getting stuck).
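If the system is still responsive when the warning appears, one way to gather evidence is to watch the interface's transmit counters. The sketch below is an assumption-laden helper, not a diagnostic tool from the driver: the function names are hypothetical, and it assumes the kernel exposes statistics/tx_packets, statistics/tx_errors and queues/tx-0/tx_timeout under /sys/class/net for the interface. Any attribute that is missing is reported as None.

```python
from pathlib import Path


def read_tx_counters(iface: str = "eth0", sysfs_root: str = "/sys/class/net") -> dict:
    """Read a few transmit counters for `iface`; missing attributes become None."""
    base = Path(sysfs_root) / iface
    paths = {
        "tx_packets": base / "statistics" / "tx_packets",
        "tx_errors": base / "statistics" / "tx_errors",
        "tx_timeout_q0": base / "queues" / "tx-0" / "tx_timeout",
    }
    counters = {}
    for name, path in paths.items():
        try:
            counters[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            counters[name] = None  # not available on this kernel or driver
    return counters


def looks_stalled(before: dict, after: dict) -> bool:
    """Heuristic only: no packets left the interface between the two
    samples while the queue-0 timeout counter increased."""
    if before["tx_packets"] is None or after["tx_packets"] is None:
        return False
    no_progress = after["tx_packets"] == before["tx_packets"]
    more_timeouts = (after["tx_timeout_q0"] or 0) > (before["tx_timeout_q0"] or 0)
    return no_progress and more_timeouts


if __name__ == "__main__":
    print(read_tx_counters("eth0"))
```

Take two samples a few seconds apart while traffic is being sent and pass them to looks_stalled. A stalled result confirms that transmission stopped, but it cannot by itself tell a driver bug from stuck hardware.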

BTW, the content of the message says your system was not doing anything (it was idle) at the time the watchdog timer fired.
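The idle state can be read from the header of the warning: it reports PID 0 and the command name swapper/0, which is the idle task. As a small illustration, this sketch pulls those fields out of a pasted log. The function name interrupted_task and the regular expression are my own assumptions about the header layout shown above, not a general parser for kernel messages.

```python
import re

# Matches the "CPU: 0 PID: 0 Comm: swapper/0" part of the warning header.
# \s* also tolerates the line break that console wrapping put after "Comm:".
HEADER_RE = re.compile(r"CPU:\s*(\d+)\s+PID:\s*(\d+)\s+Comm:\s*(\S+)")


def interrupted_task(log: str):
    """Return the CPU, PID and command name from the warning header,
    plus a flag saying whether that task is the idle task."""
    match = HEADER_RE.search(log)
    if match is None:
        return None
    cpu, pid, comm = match.groups()
    return {
        "cpu": int(cpu),
        "pid": int(pid),
        "comm": comm,
        "idle": int(pid) == 0 and comm.startswith("swapper"),
    }


if __name__ == "__main__":
    sample = "Modules linked in: CPU: 0 PID: 0 Comm:\nswapper/0 Not tainted 3.12.19-rt30-gc29fe1a #27"
    print(interrupted_task(sample))
```

Note that the watchdog runs from a timer softirq, so the task shown is simply whatever was interrupted when the timer fired.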

We can't know exactly what you're doing without digging into your code. However, here's an attempt to analyze it a little ;)

The line WARNING: at net/sched/sch_generic.c:279 Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.19-rt30-gc29fe1a shows, alongside some kernel data (not tainted means, among other things, that you haven't loaded closed-source drivers), where the warning was triggered. The stack trace confirms this as well. While this line isn't too helpful per se (for me, as I'm not familiar with the kernel source), it shows that the problem was reported from the network scheduler code. If your firewall somehow interferes with it, you should start searching there.
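Because the console wrapped the trace mid-entry, the call chain is hard to read by eye. The sketch below extracts the function names in order from the "[stack] [address] symbol+0xoff/0xsize" entries. The regular expression and the name call_trace are assumptions based only on the layout of the log in the question, so other architectures or kernel versions may need a different pattern.

```python
import re

# One trace entry looks like: [effebf10] [c052a98c] dev_watchdog+0x2dc/0x2ec
# \s+ lets an entry span a line break introduced by console wrapping.
FRAME_RE = re.compile(
    r"\[[0-9a-f]+\]\s+\[[0-9a-f]+\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def call_trace(log: str) -> list:
    """Return the symbol names of all trace entries, innermost first."""
    return FRAME_RE.findall(log)


SAMPLE = """
Trace: [effebf10] [c052a98c] dev_watchdog+0x2dc/0x2ec (unreliable)
[effebf40] [c005194c] call_timer_fn.isra.29+0x28/0x84 [effebf60]
[c0051b28] run_timer_softirq+0x180/0x1fc [effebfa0] [c004a5e8]
__do_softirq+0x100/0x1cc
"""

if __name__ == "__main__":
    for symbol in call_trace(SAMPLE):
        print(symbol)
```

Read this way, the excerpt shows dev_watchdog being called from the timer softirq path (call_timer_fn, run_timer_softirq, __do_softirq), which matches the warning location in net/sched/sch_generic.c.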

If not, you may have encountered an actual kernel bug. The first thing to do is to update your kernel version, if possible. Versions 3.19 and 4.1 are available as of writing. If this doesn't help (or you really need this version), you can file a kernel bug. Since your kernel isn't tainted, you can expect help from the devs. Good luck :)


 