
How do I detect where the program is stuck in an infinite loop?

I am working on a (relatively complex) game. The game freezes in release mode, after 1-2 minutes of game-play. My current release configuration still lets me break into the debugger. That is useful, although an optimized build may show misleading information; for this particular case that is fine, since I can turn off optimization for a single file or function.

The problem is that I (we, since we are a team) don't know where it is hanging. It is not as simple as one relatively small infinite loop spinning: other things (graphics, sound) are still being updated, and only the game-play has stalled. The main game loop (an infinite loop by design) is always running and is very long and complex, so stepping through it is going to be a pain (but it is one of the options).

The first thing I tried was Visual Studio's Break All, but it always breaks in code that is not mine and consequently shows me disassembly. Eventually, with enough persistence, SVN history checking, and commenting out code, I will be able to figure out where it is hanging, but there has to be a better way... hopefully?

Note: I am aware of a Visual Studio option that allows debugging user code only, but it applies to managed code only.

EDIT: I was able to solve the problem via the stack trace and many hours of keeping track of various things to see where the game was hanging. I will select Sjoerd's answer as the correct one. However, if someone has a suggestion for a tool or technique that automates such a task, by all means add your answer!

If you break and land in native code that is not yours, check the call stack. The call stack is the list of functions that were called to reach the current point in the code. Go up some levels in the stack until you reach the function of yours that is currently running.

As an alternative to debugging symbols and breaking into the debugger (which is the tool of choice when possible), add logging. It is not uncommon for games (and other apps) to have an extensive logging system that can be turned on and off with a compiler flag, so that some kind of debugging/tracing is still possible in "release builds". If your logging works, you should see what is and is not happening and get at least some idea of where things go wrong.
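
A minimal sketch of such a switchable log, in C++. The flag name GAME_ENABLE_LOG, the macro GAME_LOG, and the function update_gameplay are made up for illustration; they are not from any particular engine.

```cpp
#include <cstdio>

// GAME_ENABLE_LOG is a hypothetical flag. Define it on the compiler command
// line (/DGAME_ENABLE_LOG=1 with MSVC, -DGAME_ENABLE_LOG=1 with GCC/Clang)
// to compile the logging in; otherwise it defaults to off.
#ifndef GAME_ENABLE_LOG
#define GAME_ENABLE_LOG 0
#endif

#if GAME_ENABLE_LOG
// Writes "file:line: message" to stderr using a printf-style format string,
// then flushes in case the stream is buffered.
#define GAME_LOG(...)                                            \
    do {                                                         \
        std::fprintf(stderr, "%s:%d: ", __FILE__, __LINE__);     \
        std::fprintf(stderr, __VA_ARGS__);                       \
        std::fputc('\n', stderr);                                \
        std::fflush(stderr);                                     \
    } while (0)
#else
// Logging disabled: the macro expands to an empty statement.
#define GAME_LOG(...) do { } while (0)
#endif

// Hypothetical game-play step; returns the next frame number.
int update_gameplay(int frame) {
    GAME_LOG("update_gameplay: begin frame %d", frame);
    int next = frame + 1;  // stand-in for the real update work
    GAME_LOG("update_gameplay: end frame %d", frame);
    return next;
}
```

With the flag on, a "begin" line without a matching "end" line points at the step that never finished, and a step whose lines stop appearing altogether is one that is no longer being called.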

Hit the pause button in Visual Studio while the program is hung.

This should break into the debugger at the line currently being executed. You can then step through and see what is happening.

You might well never be able to catch the problem via an interrupt if the code that should be executing isn't executing. There are lots of ways this can happen. Just a few:

  • You have some parameter that indicates the time at which the next update is to be performed. If this somehow gets set to some big number, the code that does the update will happily conclude that nothing needs to be done and move on. This can give all the appearances of a hung program even though it isn't really hung at all. The state update and the graphics functions are still being called at their prescribed rate.

  • You may have some counter that represents time and some rounding mechanism for incrementing time. If the counter is a 32-bit signed int and the granularity of your counter is 0.1 microseconds, you will hit INT32_MAX after just 3.6 minutes. Now time is frozen, so once again you have a situation where updates may not be performed.

  • You are using a single-precision floating-point number to represent time, and you update time via time += delta_t; This will stop working after a few minutes if your delta_t is 10 microseconds, because the increment eventually becomes too small relative to time to change its value. This is yet another mechanism by which time can be frozen.
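
The last two mechanisms can be checked numerically. The sketch below is only an illustration: both function names are invented here, and it assumes time is measured in seconds and stored as an IEEE 754 single-precision float with the default round-to-nearest mode.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Seconds until a signed 32-bit tick counter reaches INT32_MAX,
// given the length of one tick in seconds.
double seconds_until_int32_max(double tick_seconds) {
    assert(tick_seconds > 0.0);
    return static_cast<double>(std::numeric_limits<std::int32_t>::max()) *
           tick_seconds;
}

// Returns true if adding delta_t to a single-precision time value leaves it
// unchanged, i.e. "time += delta_t" no longer advances the clock.
bool float_clock_is_frozen(float time, float delta_t) {
    // volatile forces the sum to be rounded to float before the comparison,
    // even on targets that evaluate expressions in higher precision.
    volatile float next = time + delta_t;
    return next == time;
}
```

With a tick of 0.1 microseconds, seconds_until_int32_max(1e-7) is about 214.7 seconds, which matches the 3.6 minutes quoted above. Under the stated assumptions, float_clock_is_frozen(t, 1e-5f) is false for t = 200 and true from t = 256 onward, so a float clock stepped by 10 microseconds stops a little over four minutes in.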

Edit
Have you looked at the CPU usage in your various threads? The above problems might cause the physics or game-playing thread to exhibit a drastic drop in CPU usage after a couple of minutes. You might also get this behavior if the game-playing thread is perpetually locked, but here you might (with the right tool) get an indication that the thread is always asleep.
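
Per-thread CPU usage needs a profiler or OS tooling. A different, in-process way to notice that the game-play thread has gone quiet is a heartbeat counter, sketched below. This is a swapped-in technique, not something the answers above describe, and the names Heartbeat and StallDetector are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical heartbeat: the game-play update calls beat() once per
// completed step; any other thread may read the counter.
struct Heartbeat {
    std::atomic<std::uint64_t> beats{0};
    void beat() { beats.fetch_add(1, std::memory_order_relaxed); }
};

// Hypothetical stall detector: call check() at a fixed interval from a part
// of the program that keeps running (for example once per second from the
// render loop). It reports a stall when the counter has not moved since the
// previous call.
class StallDetector {
public:
    explicit StallDetector(const Heartbeat& hb)
        : hb_(hb), last_seen_(hb.beats.load(std::memory_order_relaxed)) {}

    bool check() {
        const std::uint64_t now = hb_.beats.load(std::memory_order_relaxed);
        const bool stalled = (now == last_seen_);
        last_seen_ = now;
        return stalled;
    }

private:
    const Heartbeat& hb_;
    std::uint64_t last_seen_;
};
```

When check() returns true, the caller can log the current game state or trigger a debugger break. That catches both a thread that is blocked and an update that is still being called but is skipping its work, provided beat() is placed after the work rather than before it.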

Notice: technical posts on this site are published under the CC BY-SA 4.0 license. If you repost, please cite this site's URL or the address of the original post.

 