
StackOverflowException in a .Net Windows Service

In some circumstances my .Net Windows service can throw a StackOverflowException. Unfortunately, the service simply stops dead and doesn't write anything to the event log. I don't even get a message from the Service Control Manager saying the service has failed.

Is there any way at all for a Windows service to detect that such an exception has occurred?

In the documentation for this exception, MSDN says "Note that an application that hosts the common language runtime (CLR) can specify that the CLR unload the application domain where the stack overflow exception occurs and let the corresponding process continue". This is the kind of thing I would expect the Windows service implementation to do, but it doesn't.

Please don't just reply saying I should make sure my code never throws such an exception. Trust me, I would if I could. What I am trying to do is handle the worst-case scenario in a sensible way and make my service resilient to unexpected errors.

A stack overflow is about the worst kind of heart attack a thread can suffer. It is so bad that you don't even get an entry in the event log. It is so bad that you can't do anything reasonable to recover the state of your program. The thread is dead and so is the state of the appdomain: it got mutated in completely unpredictable ways, and you can only throw it away.

Well, you already know all that. But shrugging this off and pretending that it didn't happen causes a different kind of failure: a system failure. The service was supposed to do something and that didn't happen. There are not a lot of scenarios where that's acceptable. A file didn't get processed, a database update didn't happen, et cetera. It is the kind of mishap that can cause a chain of mishaps later on, like the CFO discovering that a million bucks is missing at the end of the year.

You didn't want to hear this, but there is no sensible way to handle this. Focus all of your efforts on finding the bug, not on the band-aid. And a stack overflow is always a programming bug.

Okay, a practical answer. You are not stuck with a fixed stack size. You can use the Thread(ThreadStart, int) constructor to create a thread with a larger stack. Give it a couple of dozen megabytes. This should go a long way toward avoiding the problem, if not completely solve it.
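A minimal sketch of the larger-stack idea, assuming the recursive work can be moved onto a dedicated thread. The Depth method and the 32 MB figure are hypothetical stand-ins, not anything from the question. The runtime may adjust or ignore the requested size in some environments, so treat the number as a request rather than a guarantee.

```csharp
using System;
using System.Threading;

static class LargeStackDemo
{
    // Hypothetical stand-in for the deeply recursive work the service performs.
    static int Depth(int n)
    {
        return n == 0 ? 0 : 1 + Depth(n - 1);
    }

    static void Main()
    {
        // Assumed value: "a couple of dozen megabytes", here 32 MB.
        const int stackBytes = 32 * 1024 * 1024;
        int result = 0;

        // Thread(ThreadStart, int): the second argument is the maximum stack size in bytes.
        var worker = new Thread(new ThreadStart(() => { result = Depth(50000); }), stackBytes);
        worker.Start();
        worker.Join(); // wait for the recursive work to finish

        Console.WriteLine("Recursion depth reached: " + result);
    }
}
```

This only buys headroom. Unbounded recursion will still exhaust a stack of any size, so it does not replace finding the bug.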

The next thing to do is to start screening the XML file you are given to process. I'm not sure whether it is the raw size of the file or bad data in the .xml that causes the SO. Start by checking the size of the file and drop it in a separate directory if it is a monster, to be processed manually, preferably by whoever created the file in the first place. And make sure that you've got a couple of trouble-maker files, if you don't have them already. Try to process them off-line with a monster thread stack size. If that still blows, start looking for an algorithm that can pre-screen the .xml content to detect the source of the problem.
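The size check could look roughly like the sketch below. The 50 MB threshold, the XmlPreScreen class and the quarantine directory are all assumptions made for illustration. The right cut-off would have to come from experimenting with the trouble-maker files.

```csharp
using System;
using System.IO;

static class XmlPreScreen
{
    // Assumed threshold; no figure is given above, so tune it against real files.
    const long MaxBytes = 50L * 1024 * 1024;

    // Returns true if the file is small enough to process automatically.
    // Otherwise moves it to quarantineDir for manual handling and returns false.
    public static bool Screen(string path, string quarantineDir)
    {
        var info = new FileInfo(path);
        if (info.Length <= MaxBytes)
        {
            return true;
        }

        Directory.CreateDirectory(quarantineDir); // no-op if it already exists
        // File.Move throws if a file with the same name is already in the target directory.
        File.Move(path, Path.Combine(quarantineDir, info.Name));
        return false;
    }

    static void Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.WriteLine("usage: XmlPreScreen <file.xml> <quarantine-dir>");
            return;
        }

        bool ok = Screen(args[0], args[1]);
        Console.WriteLine(ok ? "OK to process" : "Moved aside for manual processing");
    }
}
```

A check like this only covers the "monster file" case. If small files with bad content also trigger the overflow, a content-level pre-screen would still be needed.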

Ask another question if you think that the .xml file content might be the cause and you need to find out what kind of bad content could trigger this (I don't know much of anything about xlt).


 