
Windows Service: need to monitor error logging. Ideas?

I have a Windows service that runs 24/7 on one of our servers.

It connects to an external company, and lately that company has been going down a lot.

I need to set something up that will essentially detect when this service has logged 25 errors within the last minute.

I am guessing that I will have to create a table, insert these errors into it as they are logged, and then set up something that checks via a T-SQL query whether 25 have occurred in the last minute (and then sends out an email or updates a dashboard monitoring page for support).
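The table-and-query idea could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python with an in-memory SQLite database as a stand-in for SQL Server and T-SQL, and the table name `service_errors`, the helper function names, and the epoch-seconds timestamp column are all hypothetical.

```python
import sqlite3

THRESHOLD = 25        # errors that should trigger an alert (from the question)
WINDOW_SECONDS = 60   # "in the last minute"


def create_error_table(conn):
    # Hypothetical schema: one row per logged error, timestamped in epoch seconds.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS service_errors ("
        "id INTEGER PRIMARY KEY, "
        "logged_at REAL NOT NULL, "
        "message TEXT NOT NULL)"
    )
    # An index on the timestamp keeps the periodic window query cheap.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS ix_service_errors_logged_at "
        "ON service_errors (logged_at)"
    )


def record_error(conn, message, now):
    # Called by the service each time it logs an error.
    conn.execute(
        "INSERT INTO service_errors (logged_at, message) VALUES (?, ?)",
        (now, message),
    )


def threshold_reached(conn, now):
    # Count the errors logged during the last WINDOW_SECONDS seconds.
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM service_errors WHERE logged_at > ?",
        (now - WINDOW_SECONDS,),
    ).fetchone()
    return count >= THRESHOLD
```

A scheduled job would call `threshold_reached` periodically and send the email or update the dashboard when it returns `True`.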

Really, my question is: does anyone have a better idea than this? Someone must have done something better in the past. I have never attempted to read straight from the logs; maybe that would be a better route.

Any ideas or direction on this one are greatly appreciated. Thanks.

I have a similar problem with an external web API that my Windows Service calls periodically.

My solution was to just use NLog to write errors to a text log file, and to keep a counter in the service itself of the number of consecutive failures (failures without an intervening success). If the counter exceeds a configurable threshold, I write a Critical entry to NLog rather than an Error entry, and NLog is configured to email an alias, which goes to several people on the operations team, whenever there is a Critical event.
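The counter logic might look something like the sketch below. It is written in Python with the standard `logging` module standing in for NLog, so it is not the code from my service; the class name `FailureEscalator`, the logger name, and the default threshold of 5 are assumptions made for illustration.

```python
import logging

# Hypothetical logger name; Python's logging stands in for NLog here.
logger = logging.getLogger("external_api_monitor")


class FailureEscalator:
    """Counts consecutive failures and escalates the log level past a threshold."""

    def __init__(self, threshold=5):
        # Assumed default; the point is that the threshold is configurable.
        self.threshold = threshold
        self.consecutive_failures = 0

    def record_success(self):
        # Any successful call resets the run of failures.
        self.consecutive_failures = 0

    def record_failure(self, message):
        self.consecutive_failures += 1
        # Escalate only once the counter exceeds the threshold.
        if self.consecutive_failures > self.threshold:
            level = logging.CRITICAL
        else:
            level = logging.ERROR
        logger.log(
            level,
            "%s (consecutive failures: %d)",
            message,
            self.consecutive_failures,
        )
        return level
```

In this Python version, attaching a `logging.handlers.SMTPHandler` with its level set to `CRITICAL` would play roughly the role of the NLog email rule.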

If you need to strictly implement the "25 errors in the last minute" semantic, you could write errors to an in-memory queue constrained to a maximum of 25 items. When the queue length reaches 25, check whether the first (oldest) item in the queue is within the last minute. If so, write a Critical error to the log.
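A minimal sketch of that bounded queue, again in Python rather than the service's own language. The class name `ErrorRateMonitor` and its parameters are hypothetical, and it stores only timestamps rather than full error records.

```python
import time
from collections import deque


class ErrorRateMonitor:
    """Tracks the most recent error timestamps in a bounded queue."""

    def __init__(self, max_errors=25, window_seconds=60.0):
        self.max_errors = max_errors
        self.window_seconds = window_seconds
        # deque(maxlen=N) silently drops the oldest entry once it is full.
        self.timestamps = deque(maxlen=max_errors)

    def record_error(self, now=None):
        """Record an error; return True if max_errors fell within the window."""
        if now is None:
            now = time.monotonic()
        self.timestamps.append(now)
        if len(self.timestamps) < self.max_errors:
            return False
        # The queue is full: the oldest of the last max_errors errors is at the front.
        return now - self.timestamps[0] <= self.window_seconds
```

The caller would write the Critical log entry whenever `record_error` returns `True`. Because the queue never holds more than 25 timestamps, memory use stays constant however long the outage lasts.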

Logging is fun. :/

Your options are essentially:

  1. Log to a database server - Advantage: easy to read from other locations. Disadvantage: you need a database server. If the project doesn't already include one, it might be a pain. Also, logging fails if the problem is in network connectivity.

  2. Log to the Event Log - Advantage: fast to write locally. Can be read remotely, with the correct user permissions. Disadvantage: you'll be querying this a lot, and the event log isn't exactly built for that.

  3. Log to a file - Advantage: extremely fast writes. Disadvantage: requires a lot of permission setup for remote code to access it. May be corrupted, lost, deleted, etc.

  4. Use additional software such as System Center Operations Manager. Advantage: this is exactly the type of thing it was built for. Disadvantage: cost and setup.


Those are in my order of preference.


 