How to retry Windows service startup if the DB is offline with Castle Windsor and the NHibernate facility?

Problem: If the DB is offline when this service is started, the service will not start, because startup fails on this line: var container = new BootStrapper().Container;

private static void Main(string[] args)
{
    Logger.Info("Engine Service is bootstrapping...");
    AppDomain.CurrentDomain.UnhandledException += UncaughtExceptions.DomainException;
    Directory.SetCurrentDirectory(AppDomain.CurrentDomain.BaseDirectory);

    var container = new BootStrapper().Container;
    var controller = container.Resolve<EngineController>();
    ServiceBase.Run(controller.MainView as ServiceBase);

    container.Dispose();
}

It fails there because building the container runs the installer below, which adds the NHibernate facility with container.AddFacility<NHibernateFacility>(); and that call fails with a connection timeout.

public void Install(IWindsorContainer container, IConfigurationStore store)
{
    var isAutoTxFacilityRegistered = container.Kernel.GetFacilities().Any(f => f is AutoTxFacility);
    if (!isAutoTxFacilityRegistered) container.AddFacility<AutoTxFacility>();

    container.Register(
        Component.For<INHibernateInstaller>().ImplementedBy<CieFluentInstaller>().IsDefault().LifestyleTransient(),
        Classes.FromThisAssembly().Pick().WithService.DefaultInterfaces().LifestyleTransient()
        );

    var isNHibernateFacilityRegistered = container.Kernel.GetFacilities().Any(f => f is NHibernateFacility);
    if (!isNHibernateFacilityRegistered) container.AddFacility<NHibernateFacility>();
}

If the Windows service takes longer than 30 seconds to start (which it may if updates or backups are being run on the DB), the service fails to start.

I'm using Fluent NHibernate, NHibernate, and Castle Windsor with the NHibernateFacility.

Things I've tried:

  • I can't do it from the service start event, because startup fails before it gets to the view or controller. The view and controller have no direct access to the IoC container, only via an injected IoCFactory, as per Castle Windsor recommendations.

  • I've tried to spawn a thread in Main and run a retry loop there, but because the service "waits" inside the ServiceBase.Run method, I can't seem to get the correct returns to make it "fake start" while the retry loop is running.

  • I investigated lengthening the service start timeout, but I can't access the ServiceBase/view since startup fails before then, and a system-wide change at hundreds of production sites is not an option.

Question: Given this design, how can I make the Windows service "start" when the DB is offline?

You need to divide your startup actions into two categories:

  1. Actions that must happen fairly immediately and/or won't fix themselves in case of failure. An example is a missing mandatory configuration file, which would require administrator intervention.

  2. Actions that we're OK to delay or, more importantly, actions that can fail due to transient errors. Such errors include a network failure, or the service happening to start somewhat faster than the database server after a reboot.

Your service's OnStart code should follow this basic structure:

OnStart:
    Perform the immediate category 1 tasks and exit if any of these fail.
    Launch the main application thread.

One approach to the "main application thread" is to follow this basic structure:

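ManualResetEvent shutdownRequestedEvent = new ManualResetEvent(false);  // initially unsignaled

RealMain:
    bool bootstrapPerformed = false;
    while (!shutdownRequestedEvent.WaitOne(0) && !bootstrapPerformed)
    {
        try
        {
            PerformBootstrap();
            bootstrapPerformed = true;
        }
        catch (Exception ex)
        {
            LogError(ex);
        }

        // Pause before the next attempt, but wake up at once if shutdown is requested.
        if (!bootstrapPerformed)
            shutdownRequestedEvent.WaitOne(retryDelay);  // retryDelay: some timeout
    }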

    Second bootstrap action similar to above, etc.

    Third bootstrap action similar to above, etc.

    Eventually, start performing real work, while listening to 
    the shutdownRequestedEvent.

The service's OnStop and OnShutdown handlers would signal the shutdownRequestedEvent and then wait for the RealMain thread to exit.

If the RealMain thread serves no purpose other than setup, it should perhaps be allowed to exit when it's done with all bootstrap tasks.
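The structure described above can be sketched as a small, self-contained C# class. This is only an illustration under stated assumptions: StartupWorker, its Start/Stop methods, and the simulated bootstrap delegate are hypothetical names that do not appear in the question. In a real service, OnStart would call Start() after the category 1 checks, and OnStop/OnShutdown would call Stop().

```csharp
using System;
using System.Threading;

// Hypothetical worker that retries a bootstrap action on its own thread.
public sealed class StartupWorker
{
    private readonly ManualResetEvent _shutdownRequested = new ManualResetEvent(false);
    private readonly Action _bootstrap;     // e.g. building the container
    private readonly TimeSpan _retryDelay;  // pause between failed attempts
    private Thread _thread;

    public StartupWorker(Action bootstrap, TimeSpan retryDelay)
    {
        _bootstrap = bootstrap;
        _retryDelay = retryDelay;
    }

    public bool BootstrapPerformed { get; private set; }
    public int Attempts { get; private set; }

    // Returns immediately, so the caller (OnStart) is not blocked by the DB.
    public void Start()
    {
        _thread = new Thread(RealMain) { IsBackground = true };
        _thread.Start();
    }

    // Signals shutdown, then waits for the worker thread to exit.
    public void Stop()
    {
        _shutdownRequested.Set();
        _thread?.Join();
    }

    private void RealMain()
    {
        while (!_shutdownRequested.WaitOne(0) && !BootstrapPerformed)
        {
            try
            {
                Attempts++;
                _bootstrap();
                BootstrapPerformed = true;
            }
            catch (Exception ex)
            {
                Console.Error.WriteLine("Bootstrap failed, will retry: " + ex.Message);
            }

            // Wait before retrying, but wake up at once on shutdown.
            if (!BootstrapPerformed)
                _shutdownRequested.WaitOne(_retryDelay);
        }
        // Real work would start here, still observing _shutdownRequested.
    }
}

public static class Demo
{
    public static void Main()
    {
        int calls = 0;
        // Simulated bootstrap: fails twice (as if the DB were offline), then succeeds.
        var worker = new StartupWorker(
            () => { if (++calls < 3) throw new TimeoutException("DB offline"); },
            TimeSpan.FromMilliseconds(50));

        worker.Start();
        Thread.Sleep(1000);   // stand-in for the service running for a while
        worker.Stop();
        Console.WriteLine(worker.BootstrapPerformed + " after " + worker.Attempts + " attempts");
    }
}
```

Because Start() only launches a thread, the 30-second start timeout no longer depends on how long the database takes to come back.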

Another thing to be careful about is making sure that your service, during normal operation, can withstand the temporary loss of access to a resource due to transient errors. For example, your service shouldn't crash just because someone reboots the database server. It should just wait patiently and retry forever.

An alternative approach that can work in some cases is to handle the bootstrapping as a dependency of whatever the real task is. For instance, launch the real task. The real task requests a database session; to get one, we must have the session factory; if we don't yet have the session factory, we start its initialization. If the session factory cannot be created, the exception bubbles up and the whole task fails. The remaining work is then to wait a little while and retry the task. Repeat forever.
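A minimal sketch of this dependency-driven variant follows, assuming a hand-written holder that caches the resource only after it has been created successfully. RetryableLazy, TaskRunner, and the string stand-in for the session factory are hypothetical names used for illustration; real code would build an ISessionFactory in the factory delegate.

```csharp
using System;
using System.Threading;

// Hypothetical holder: creates the resource on first use and caches it
// only once creation has succeeded.
public sealed class RetryableLazy<T> where T : class
{
    private readonly Func<T> _factory;
    private readonly object _gate = new object();
    private T _value;

    public RetryableLazy(Func<T> factory) { _factory = factory; }

    public T Value
    {
        get
        {
            lock (_gate)
            {
                // If the factory throws, _value stays null and the
                // next caller simply tries again.
                return _value ?? (_value = _factory());
            }
        }
    }
}

public static class TaskRunner
{
    // Runs the task until it succeeds or shutdown is signaled.
    // Returns true if the task completed.
    public static bool RunWithRetry(Action task, WaitHandle shutdown, TimeSpan delay)
    {
        while (!shutdown.WaitOne(0))
        {
            try
            {
                task();
                return true;
            }
            catch (Exception ex)
            {
                Console.Error.WriteLine("Task failed, retrying: " + ex.Message);
            }
            shutdown.WaitOne(delay);   // wait a little while, or stop early on shutdown
        }
        return false;
    }
}

public static class LazyDemo
{
    public static void Main()
    {
        int attempts = 0;
        // Stand-in for building the session factory; fails on the first two calls.
        var sessionFactory = new RetryableLazy<string>(() =>
        {
            if (++attempts < 3) throw new TimeoutException("DB offline");
            return "session factory";
        });

        using (var shutdown = new ManualResetEvent(false))
        {
            bool done = TaskRunner.RunWithRetry(
                () => Console.WriteLine("Working with " + sessionFactory.Value),
                shutdown,
                TimeSpan.FromMilliseconds(50));
            Console.WriteLine(done);
        }
    }
}
```

A custom holder is used here because System.Lazy<T> in its default thread-safety mode caches an exception thrown by the value factory, which would defeat the retry.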

This turned out to be a bug in NHibernate that prevented any of the above from working. Something changed between NHibernate 2.0 and 3.0, so with NHibernate v3.0+ you have to add the following to the configuration (in this case, through Fluent NHibernate):

cfg.SetProperty("hbm2ddl.keywords", "none");

This allows NHibernate to bootstrap itself properly, so startup now reaches the controller without error.
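For readers unsure where that call goes with Fluent NHibernate, one possible placement is inside ExposeConfiguration. The sketch below is an assumption-laden fragment: the SessionFactoryBuilder class, the SQL Server 2008 dialect, and the connectionString parameter are placeholders, since the question does not show the body of CieFluentInstaller or name the database. It needs the FluentNHibernate and NHibernate packages to compile.

```csharp
using FluentNHibernate.Cfg;
using FluentNHibernate.Cfg.Db;
using NHibernate;

// Hypothetical helper; the real project configures NHibernate in its own installer.
public class SessionFactoryBuilder
{
    // connectionString is a placeholder supplied by the caller.
    public static ISessionFactory Build(string connectionString)
    {
        return Fluently.Configure()
            // Assumed dialect; substitute the one that matches the actual database.
            .Database(MsSqlConfiguration.MsSql2008.ConnectionString(connectionString))
            .Mappings(m => m.FluentMappings.AddFromAssemblyOf<SessionFactoryBuilder>())
            // Same setting as above, applied to the underlying NHibernate configuration.
            .ExposeConfiguration(cfg => cfg.SetProperty("hbm2ddl.keywords", "none"))
            .BuildSessionFactory();
    }
}
```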
