
Windows Service Job Monitor via Touch of Task-based Asynchronous Pattern (TAP)

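I'm building a Windows service in C# that monitors a job queue. When it finds an available item in the queue, it launches a job that processes that item to completion (including failures). I'm using Task.Factory.StartNew(), and I'm admittedly quite new to TAP (I'll be off reading blogs once I finish this post).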

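Requirements

  1. Poll a database queue at a regular interval for available jobs. (For the sake of this question, let's ignore the message queue vs. database queue debate.)

  2. Launch jobs asynchronously so that polling can keep monitoring the queue and launching new jobs.

  3. Honor a job threshold so that too many jobs are not running at once.

  4. "Delay" shutdown of the service while jobs are still processing.

  5. Make sure any failure inside a job is logged to the Event Log and does not crash the Windows service.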
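Requirements 2 and 3 together amount to bounded concurrency. As a minimal sketch, assuming a SemaphoreSlim is used as the gate (a swapped-in technique for illustration, not what the code in this post does, which counts dictionary entries instead), the idea could look like the following. ThrottleSketch, RunAsync, SimulatedJob and the 50 ms delay are hypothetical names and values.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: caps the number of simulated jobs in flight.
public static class ThrottleSketch
{
    private sealed class Counters
    {
        public int Running; // jobs currently executing
        public int Peak;    // highest value Running has reached
    }

    // Launches jobCount simulated jobs, never more than maxConcurrent
    // at once, and returns the peak concurrency actually observed.
    public static async Task<int> RunAsync( int jobCount, int maxConcurrent )
    {
        var gate = new SemaphoreSlim( maxConcurrent, maxConcurrent );
        var counters = new Counters();
        var tasks = new List<Task>();

        for ( var i = 0; i < jobCount; i++ )
        {
            // Wait asynchronously for a free slot; no thread is blocked here.
            await gate.WaitAsync();
            tasks.Add( Task.Run( () => SimulatedJob( gate, counters ) ) );
        }

        await Task.WhenAll( tasks );
        return counters.Peak;
    }

    private static async Task SimulatedJob( SemaphoreSlim gate, Counters counters )
    {
        try
        {
            var now = Interlocked.Increment( ref counters.Running );

            // Raise Peak to 'now' unless a higher value is already recorded.
            int seen;
            while ( ( seen = Volatile.Read( ref counters.Peak ) ) < now
                && Interlocked.CompareExchange( ref counters.Peak, now, seen ) != seen )
            {
            }

            await Task.Delay( 50 ); // stand-in for real job work
        }
        finally
        {
            // Give the slot back even if the job failed.
            Interlocked.Decrement( ref counters.Running );
            gate.Release();
        }
    }
}
```

Because the slot is released in a finally block, a failing job cannot permanently shrink the available capacity. Calling ThrottleSketch.RunAsync( 20, 3 ) should never report a peak above 3.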

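My code is below, but first here are my questions and concerns (if there is a better place to post this, let me know). They mainly revolve around proper usage of TAP and its "stability". Note that most of my questions are also documented in the code.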

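Questions

  1. In PollJobs, is the way I use Task.Factory.StartNew/ContinueWith the right usage to keep throughput high in this job processing service? I should never be blocking any threads, and I want to use the correct pattern for the little bit of TAP I currently have.

  2. ConcurrentDictionary: I use it to monitor the currently running jobs, and each job removes itself from the dictionary when it completes (on a separate thread, I assume, because of Task.Factory.StartNew). Since it is a ConcurrentDictionary, I assume I don't need any locks anywhere I use it?

  3. Job exceptions (the worst being OutOfMemoryException): no exception during job processing may take down the service, and every one must be properly logged to the Event Log and the database queue. Unfortunately, some jobs currently do throw OutOfMemoryException. Is the try/catch inside the job processing enough to catch and handle every scenario so that the Windows service never terminates unexpectedly? Or would it be better/safer to spin up an AppDomain per job for more isolation? (Overkill?)

  4. I've seen arguments about which timer is the "right" one to use, but no decent answer. Any opinions on the setup and usage of my System.Threading.Timer? (Specifically, how I make sure PollJobs is never called again until the previous call has finished.)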
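Regarding question 3 and the logging continuation: a faulted Task exposes its failure as an AggregateException, so the useful details sit in the inner exceptions rather than in the aggregate's own message. A minimal sketch of collecting them in a continuation, assuming a plain Action stands in for the job body (FaultLoggingSketch and RunAndCollectErrors are hypothetical names, and the returned strings stand in for Event Log entries):

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: observe a job's failure in a continuation.
public static class FaultLoggingSketch
{
    // Runs the job on a dedicated thread and returns the messages a
    // logging continuation would write (empty when the job succeeds).
    public static Task<string[]> RunAndCollectErrors( Action job )
    {
        return Task.Factory
            .StartNew(
                job,
                CancellationToken.None,
                TaskCreationOptions.LongRunning,
                TaskScheduler.Default )
            .ContinueWith(
                t =>
                {
                    if ( !t.IsFaulted ) return new string[ 0 ];

                    // Reading t.Exception marks the fault as observed.
                    // Flatten() unwraps nested AggregateExceptions.
                    return t.Exception
                        .Flatten()
                        .InnerExceptions
                        .Select( e => e.GetType().Name + ": " + e.Message )
                        .ToArray();
                },
                TaskScheduler.Default );
    }
}
```

With a job that throws new InvalidOperationException( "boom" ), the returned array holds the single entry "InvalidOperationException: boom". This only covers exceptions that surface through the Task; it makes no claim about failures that terminate the process outright.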

If you've made it this far, thanks in advance.

Code

public partial class EvolutionService : ServiceBase
{
    EventLog eventLog = new EventLog() { 
        Source = "BTREvolution", 
        Log = "Application" 
    };
    Timer timer;
    int pollingSeconds = 1;

    // Dictionary of currently running jobs so I can query current workload.  
    // Since ConcurrentDictionary, hoping I don't need any lock() { } code.
    private ConcurrentDictionary<Guid, RunningJob> runningJobs = 
        new ConcurrentDictionary<Guid, RunningJob>();

    public EvolutionService( string[] args )
    {
        InitializeComponent();

        if ( !EventLog.SourceExists( eventLog.Source ) )
        {
            EventLog.CreateEventSource( 
                eventLog.Source, 
                eventLog.Log );
        }
    }

    protected override void OnStart( string[] args )
    {
        // Only run polling code one time and the PollJobs will 
        // initiate next poll interval so that PollJobs is never 
        // called again before it finishes its processing, http://stackoverflow.com/a/1698409/166231
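        // TotalMilliseconds is the full duration (5000); Milliseconds
        // would only be the millisecond component (0).
        timer = new System.Threading.Timer( 
            PollJobs, null, 
            (int)TimeSpan.FromSeconds( 5 ).TotalMilliseconds, 
            Timeout.Infinite );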
    }

    protected override void OnPause()
    {
        // Disable the timer polling so no more jobs are processed
        timer = null;

        // Don't allow pause if any jobs are running
        if ( runningJobs.Count > 0 )
        {
            var searcher = new System.Management.ManagementObjectSearcher( 
                "SELECT UserName FROM Win32_ComputerSystem" );
            var collection = searcher.Get();
            var username = 
                (string)collection
                    .Cast<System.Management.ManagementBaseObject>()
                    .First()[ "UserName" ];
            throw new InvalidOperationException( $"{username} requested pause.  The service will not process incoming jobs, but it must finish the {runningJobs.Count} running job(s) before it can be paused." );
        }

        base.OnPause();
    }

    protected override void OnContinue()
    {
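        // Tell timer to start polling one time in 5 seconds.
        // TotalMilliseconds is the full duration (5000); Milliseconds
        // would only be the millisecond component (0).
        timer = new System.Threading.Timer( 
            PollJobs, null, 
            (int)TimeSpan.FromSeconds( 5 ).TotalMilliseconds, 
            Timeout.Infinite );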
        base.OnContinue();
    }

    protected override void OnStop()
    {
        // Disable the timer polling so no more jobs are processed
        timer = null;

        // Until all jobs successfully cancel, keep requesting more time
        // http://stackoverflow.com/a/13952149/166231
        var task = Task.Run( () =>
        {
            // If any running jobs, send the Cancel notification
            if ( runningJobs.Count > 0 )
            {
                foreach ( var job in runningJobs )
                {
                    eventLog.WriteEntry( 
                        $"Cancelling job {job.Value.Key}" );
                    job.Value.CancellationSource.Cancel();
                }
            }

            // When a job cancels (and thus completes) it'll 
            // be removed from the runningJobs workload monitor.  
            // While any jobs are running, just wait another second
            while ( runningJobs.Count > 0 )
            {
                Task.Delay( TimeSpan.FromSeconds( 1 ) ).Wait();
            }
        } );

        // While the task is not finished, every 5 seconds 
        // I'll request an additional 5 seconds
        while ( !task.Wait( TimeSpan.FromSeconds( 5 ) ) )
        {
            RequestAdditionalTime( 
                (int)TimeSpan.FromSeconds( 5 ).TotalMilliseconds );
        }
    }

    public void PollJobs( object state )
    {
        // If no jobs processed, then poll at regular interval
        var timerDue = 
            (int)TimeSpan.FromSeconds( pollingSeconds ).TotalMilliseconds;

        try
        {
            // Could define some sort of threshold here so it 
            // doesn't get too bogged down, might have to check
            // Jobs by type to know whether 'batch' or 'single' 
            // type jobs, for now, just not allowing more than
            // 10 jobs to run at once.
            var availableProcesses = 
                Math.Max( 0, 10 - runningJobs.Count );

            if ( availableProcesses == 0 ) return;

            var availableJobs = 
                JobProvider.TakeNextAvailableJobs( availableProcesses );
            foreach ( var j in availableJobs )
            {
                // If any jobs processed, poll immediately when finished
                timerDue = 0;

                var job = new RunningJob
                {
                    Key = j.jKey,
                    InputPackage = j.JobData.jdInputPackage,
                    DateStart = j.jDateStart.Value,
                    CancellationSource = new CancellationTokenSource()
                };

                // Add running job to the workload monitor
                runningJobs.AddOrUpdate(
                    j.jKey,
                    job,
                    ( key, info ) => null );

                Task.Factory
                    .StartNew(
                        i =>
                        {
                            var t = (Tuple<Guid, CancellationToken>)i;

                            var key = t.Item1; // Job Key
                            // Running process would check if cancel has been requested
                            var token = t.Item2; 
                            var totalProfilesProcess = 1;

                            try
                            {
                                eventLog.WriteEntry( $"Running job {key}" );

                                // All code in here completes the jobs.
                                // Will be a separate method per JobType.
                                // Any exceptions in here *MUST NOT* 
                                // take down service.  Before allowing 
                                // the exception to propagate back up 
                                // into *this* try/catch, the code must 
                                // successfully clean up any resources 
                                // and state that was being modified so 
                                // that the client who submitted this 
                                // job is properly notified.

                                // This is just simulation of running a 
                                // job...so each job takes 10 seconds.
                                for ( var d = 0; d < 10; d++ )
                                {
                                    // or could do await if I called Unwrap(),
                                    // https://blogs.msdn.microsoft.com/pfxteam/2011/10/24/task-run-vs-task-factory-startnew/
                                    Task.Delay( 1000 ).Wait();
                                    totalProfilesProcess++;

                                    if ( token.IsCancellationRequested )
                                    {
                                        // TODO: Clean up the job
                                        throw new OperationCanceledException( token );
                                    }
                                }

                                // Success
                                JobProvider.UpdateJobStatus( key, 2, totalProfilesProcess );
                            }
                            catch ( OperationCanceledException )
                            {
                                // Cancelled
                                JobProvider.UpdateJobStatus( key, 3, totalProfilesProcess );
                                throw;
                            }
                            catch ( Exception )
                            {
                                // Failed
                                JobProvider.UpdateJobStatus( key, 4, totalProfilesProcess );
                                throw;
                            }
                        },
                        // Pass cancellation token to job so it can watch cancel request
                        Tuple.Create( j.jKey, job.CancellationSource.Token ),
                        // associate cancellation token with Task started via StartNew()
                        job.CancellationSource.Token, 
                        TaskCreationOptions.LongRunning,
                        TaskScheduler.Default

                    ).ContinueWith(
                        ( t, k ) =>
                        {
                            // When Job is finished, log exception if present.
                            // Haven't tested this yet, but think 
                            // Exception will always be AggregateException
                            // so I'll have to examine the InnerException.
                            if ( !t.IsCanceled && t.IsFaulted )
                            {
                                eventLog.WriteEntry( $"Exception for {k}: {t.Exception.Message}", EventLogEntryType.Error );
                            }

                            eventLog.WriteEntry( $"Completed job {k}" );

                            // Remove running job from the workload monitor
                            RunningJob completedJob;
                            runningJobs.TryRemove( 
                                (Guid)k, out completedJob );
                        },
                        j.jKey
                    );
            }
        }
        catch ( Exception ex )
        {
            // If can't even launch job, disable the polling.
            // TODO: Could have an error threshold where I don't
            // shut down until X errors happens
            eventLog.WriteEntry( 
                ex.Message + "\r\n\r\n" + ex.StackTrace, 
                EventLogEntryType.Error );
            timer = null;
        }
        finally
        {
            // If timer wasn't 'terminated' in OnPause or OnStop, 
            // then set to call timer again
            if ( timer != null )
            {
                timer.Change( timerDue, Timeout.Infinite );
            }
        }
    }
}

class RunningJob
{
    public Guid Key { get; set; }
    public DateTime DateStart { get; set; }
    public XElement InputPackage { get; set; }
    public CancellationTokenSource CancellationSource { get; set; }
}
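One detail worth keeping in mind whenever a timer due time is derived from a TimeSpan: the Milliseconds property is only the millisecond component of the value, while TotalMilliseconds is the whole duration. A small sketch of a conversion helper, assuming the int overloads of System.Threading.Timer are being used (DueTimeSketch and ToDueTime are hypothetical names):

```csharp
using System;

// Hypothetical helper for building timer due times from a TimeSpan.
public static class DueTimeSketch
{
    // Returns the full duration in whole milliseconds, which is the
    // unit the int-based Timer constructor and Change() expect.
    public static int ToDueTime( TimeSpan delay )
    {
        return (int)delay.TotalMilliseconds;
    }
}
```

For example, DueTimeSketch.ToDueTime( TimeSpan.FromSeconds( 5 ) ) yields 5000, whereas TimeSpan.FromSeconds( 5 ).Milliseconds is 0, which would make a one-shot timer fire immediately.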

I solved my problem with Hangfire.io.


