
Debugging the “safepoint” error: is a theoretical or a practical approach needed for debugging JVM crashes?

We have an amazingly elusive JVM crash occurring on an Ubuntu server that runs on AWS.

  • Our JVM crashes while crawling a few web pages.

  • The crash occurs at line 308 of the "safepoint" cpp module, at the point where a gauranteeArmed==0 statement occurs.

  • Our sysadmin has advised that, at the time of the crash, a massive number of threads had been created by the JVM.

  • We have not reproduced this bug on other Linux or OS X boxes.

  • We use the Ning library to crawl a few web pages.
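To put numbers on the sysadmin's observation about thread counts, the crawler could log the JVM's own thread counters at intervals. The following is only a minimal sketch using the standard `java.lang.management.ThreadMXBean`; the class name `ThreadCountProbe`, the one-second interval, and the sample count are illustrative assumptions, not part of our actual crawler.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: the name and the sampling loop are assumptions for illustration.
final class ThreadCountProbe {

    /** Returns a one-line summary of the JVM-wide thread counters. */
    static String snapshot() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return "live=" + threads.getThreadCount()
                + " daemon=" + threads.getDaemonThreadCount()
                + " peak=" + threads.getPeakThreadCount()
                + " totalStarted=" + threads.getTotalStartedThreadCount();
    }

    public static void main(String[] args) throws InterruptedException {
        // Print a timestamped sample once per second, five times.
        for (int i = 0; i < 5; i++) {
            System.out.println(System.currentTimeMillis() + " " + snapshot());
            Thread.sleep(1000);
        }
    }
}
```

In a real run, `snapshot()` would be called from a background timer inside the crawling process, so that the last lines logged before a crash show how quickly the live and total-started counts were growing.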

Related Posts

In each of these posts, a "safepoint"-related crash that comes from "nowhere" was observed. Most interestingly, the first post above actually exhibits a JVM crash during network-related events.

The cryptic nature of this crash leads me to believe that either there is a bug related to thread creation and scheduling, specific to our current version of Ubuntu and the way Java invokes some of its concurrency features, or there is some underlying library incompatibility that is highly idiosyncratic to our particular situation.
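One way to test the thread-creation suspicion would be to check which component owns the threads that exist shortly before a crash. The sketch below groups live threads by name prefix using the standard `Thread.getAllStackTraces()`; the class name `ThreadHistogram` and the bucketing rule (strip trailing digits and separators) are assumptions for illustration, and it is written without Java 8 features so that it should also compile on a 6 or 7 JVM.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: the name and the bucketing rule are assumptions for illustration.
final class ThreadHistogram {

    /** Strips trailing digits, '-', '#' and whitespace, so "pool-1-thread-7" becomes "pool-1-thread". */
    static String bucket(String threadName) {
        return threadName.replaceAll("[-#\\s\\d]+$", "");
    }

    /** Counts the currently live threads per name bucket, sorted by bucket name. */
    static Map<String, Integer> countByBucket() {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            String key = bucket(t.getName());
            Integer seen = counts.get(key);
            counts.put(key, seen == null ? 1 : seen + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> e : countByBucket().entrySet()) {
            System.out.println(e.getValue() + "\t" + e.getKey());
        }
    }
}
```

If one bucket dominates, its name prefix usually hints at the thread pool or library that is creating the threads, which narrows the search before digging into the native crash log.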

My Question(s)

My main question here is: what is the best method for debugging a JVM stack trace involving these "safepoints", and where can I get started learning about dealing with such errors? There have been other questions along this line, but I have not seen a generic answer.

Secondarily, any insight into AWS, Java, networking, and how Ubuntu might behave differently in the cloud would be useful here.

Try using the very latest JVM (6u32 or 7u4) and see if it's still reproducible. If you are on an older version, there's at least a decent chance it's already been fixed in the latest.
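Before and after upgrading, it may be worth confirming which JVM build the crawler process actually runs under, since a server can have several installed. A minimal sketch that reads the standard system properties follows; the class name `JvmVersionInfo` is an assumption for illustration.

```java
// Hypothetical helper: the class name is an assumption; the property keys are standard.
final class JvmVersionInfo {

    /** Describes the running JVM and operating system from standard system properties. */
    static String describe() {
        return System.getProperty("java.vm.name")
                + " " + System.getProperty("java.vm.version")
                + " (java.version=" + System.getProperty("java.version")
                + ", os=" + System.getProperty("os.name")
                + " " + System.getProperty("os.version")
                + " " + System.getProperty("os.arch") + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Logging this line at startup on both the crashing server and a machine where the crash does not reproduce makes it easy to compare the JVM and OS versions involved.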
