
Limit the number of requests per second generated by the Dns.BeginGetHostEntry method, or use the Task Parallel Library (TPL)

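I have used the Dns.BeginGetHostEntry method to get the FQDN of hosts based on their host names (the list of host names is stored in a SQL Server database). This asynchronous method completes a run over nearly 150k records in less than 30 minutes and writes the FQDN back to the same SQL table that stores the host names.

This solution runs too fast: it exceeds the threshold of 300 requests per second. Because the number of requests a server is permitted to generate is limited, my server was listed among the top talkers and I was asked to stop running the application. I had to rebuild the application to run synchronously, and it now takes more than 6 hours to complete.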

//// TotalRecords are fetched from SQL database with the Hostname (referred as host further)
for (int i = 0; i < TotalRecords.Rows.Count; i++)
{
    try
    {
        host = TotalRecords.Rows[i].ItemArray[0].ToString();
        Interlocked.Increment(ref requestCounter);
        string[] arr = new string[] { i.ToString(), host }; 
        Dns.BeginGetHostEntry(host, GetHostEntryCallback,arr);
    }
    catch (Exception ex)
    {
        log.Error("Unknown error occurred\n ", ex);
    }
}
do
{
    Thread.Sleep(0);

} while (requestCounter>0);

ListAdapter.Update(TotalRecords);

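Questions:

  1. Is there any way to limit the number of requests per second generated by this method?

  2. My understanding is that ParallelOptions.MaxDegreeOfParallelism does not control the number of threads per second, so is there any way TPL could be the better option? Can it be limited to a number of requests per second?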

Use a SemaphoreSlim together with a Timer to limit the requests per period.

[DebuggerDisplay( "Current Count = {_semaphore.CurrentCount}" )]
public class TimedSemaphoreSlim : IDisposable
{
    private readonly System.Threading.SemaphoreSlim _semaphore;
    private readonly System.Threading.Timer _timer;
    private int _releaseCount;

    public TimedSemaphoreSlim( int initialcount, TimeSpan period )
    {
        _semaphore = new System.Threading.SemaphoreSlim( initialcount );
        _timer = new System.Threading.Timer( OnTimer, this, period, period );
    }

    public TimedSemaphoreSlim( int initialCount, int maxCount, TimeSpan period )
    {
        _semaphore = new SemaphoreSlim( initialCount, maxCount );
        _timer = new Timer( OnTimer, this, period, period );
    }

    private void OnTimer( object state )
    {
        var releaseCount = Interlocked.Exchange( ref _releaseCount, 0 );
        if ( releaseCount > 0 )
            _semaphore.Release( releaseCount );
    }

    public WaitHandle AvailableWaitHandle => _semaphore.AvailableWaitHandle;
    public int CurrentCount => _semaphore.CurrentCount;

    public void Release()
    {
        Interlocked.Increment( ref _releaseCount );
    }

    public void Release( int releaseCount )
    {
        Interlocked.Add( ref _releaseCount, releaseCount );
    }

    public void Wait()
    {
        _semaphore.Wait();
    }

    public void Wait( CancellationToken cancellationToken )
    {
        _semaphore.Wait( cancellationToken );
    }

    public bool Wait( int millisecondsTimeout )
    {
        return _semaphore.Wait( millisecondsTimeout );
    }

    public bool Wait( int millisecondsTimeout, CancellationToken cancellationToken )
    {
        return _semaphore.Wait( millisecondsTimeout, cancellationToken );
    }

    public bool Wait( TimeSpan timeout, CancellationToken cancellationToken )
    {
        return _semaphore.Wait( timeout, cancellationToken );
    }

    public Task WaitAsync()
    {
        return _semaphore.WaitAsync();
    }

    public Task WaitAsync( CancellationToken cancellationToken )
    {
        return _semaphore.WaitAsync( cancellationToken );
    }

    public Task<bool> WaitAsync( int millisecondsTimeout )
    {
        return _semaphore.WaitAsync( millisecondsTimeout );
    }

    public Task<bool> WaitAsync( TimeSpan timeout )
    {
        return _semaphore.WaitAsync( timeout );
    }

    public Task<bool> WaitAsync( int millisecondsTimeout, CancellationToken cancellationToken )
    {
        return _semaphore.WaitAsync( millisecondsTimeout, cancellationToken );
    }

    public Task<bool> WaitAsync( TimeSpan timeout, CancellationToken cancellationToken )
    {
        return _semaphore.WaitAsync( timeout, cancellationToken );
    }

    #region IDisposable Support
    private bool disposedValue = false; // Used to detect redundant calls.

    private void CheckDisposed()
    {
        if ( disposedValue )
        {
            throw new ObjectDisposedException( nameof( TimedSemaphoreSlim ) );
        }
    }

    protected virtual void Dispose( bool disposing )
    {
        if ( !disposedValue )
        {
            if ( disposing )
            {
                _timer.Dispose();
                _semaphore.Dispose();
            }

            disposedValue = true;
        }
    }

    public void Dispose()
    {
        Dispose( true );
    }
    #endregion
}

Sample usage

IEnumerable<string> bunchOfHosts = GetBunchOfHosts();
IList<IPHostEntry> result;

using ( var limiter = new TimedSemaphoreSlim( 300, 300, TimeSpan.FromSeconds( 1 ) ) )
{
    result = bunchOfHosts.AsParallel()
        .Select( e =>
        {
            limiter.Wait();
            try
            {
                return Dns.GetHostEntry( e );
            }
            finally
            {
                limiter.Release();
            }
        } )
        .ToList();
}
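The sample above blocks one PLINQ worker thread per pending lookup. As a minimal sketch of the same idea without blocked threads, the limiter can be awaited instead. `PeriodicLimiter` and `ThrottledLookup` are hypothetical names introduced only for this illustration, and the refill policy is simplified compared with `TimedSemaphoreSlim`: the timer tops the semaphore back up to its maximum once per period instead of returning only the permits that were released.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not from the answer above): permits are consumed by
// WaitAsync and topped up to the maximum once per period by a timer.
public sealed class PeriodicLimiter : IDisposable
{
    private readonly SemaphoreSlim _semaphore;
    private readonly Timer _timer;
    private readonly int _permitsPerPeriod;

    public PeriodicLimiter(int permitsPerPeriod, TimeSpan period)
    {
        _permitsPerPeriod = permitsPerPeriod;
        _semaphore = new SemaphoreSlim(permitsPerPeriod, permitsPerPeriod);
        _timer = new Timer(_ => Refill(), null, period, period);
    }

    private void Refill()
    {
        try
        {
            int missing = _permitsPerPeriod - _semaphore.CurrentCount;
            if (missing > 0)
                _semaphore.Release(missing);
        }
        catch (ObjectDisposedException) { /* timer fired after Dispose */ }
        catch (SemaphoreFullException) { /* overlapping timer callbacks */ }
    }

    public Task WaitAsync() => _semaphore.WaitAsync();

    public void Dispose()
    {
        _timer.Dispose();
        _semaphore.Dispose();
    }
}

public static class ThrottledLookup
{
    // The lookup is passed in so it can be Dns.GetHostEntryAsync in production
    // or a stub in a test. Results come back in the same order as the hosts.
    public static async Task<TResult[]> RunAsync<TResult>(
        IEnumerable<string> hosts,
        Func<string, Task<TResult>> lookup,
        int permitsPerPeriod,
        TimeSpan period)
    {
        using (var limiter = new PeriodicLimiter(permitsPerPeriod, period))
        {
            var tasks = hosts.Select(async host =>
            {
                await limiter.WaitAsync(); // waits without blocking a thread
                return await lookup(host);
            }).ToList();                   // ToList starts every task; the limiter gates them

            return await Task.WhenAll(tasks);
        }
    }
}

// Possible call, assuming a sequence of host names called bunchOfHosts:
// var entries = await ThrottledLookup.RunAsync(
//     bunchOfHosts, h => System.Net.Dns.GetHostEntryAsync(h), 300, TimeSpan.FromSeconds(1));
```

Because the window is fixed, a burst at the end of one period followed by a burst at the start of the next can briefly exceed the per-second figure, so it may be worth choosing a limit somewhat below the real threshold.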

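A purely asynchronous solution.

It uses the NuGet packages Nito.AsyncEx and System.Reactive. It performs error handling and provides the DNS results as an IObservable<IPHostEntry>.

There is a lot going on here. You will need to understand Reactive Extensions as well as standard async programming. There are probably many ways to achieve the result below, but this is an interesting solution.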

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using System.Linq;
using System.Collections.Generic;
using System.Diagnostics;
using System.Net;
using System.Reactive.Disposables;
using System.Reactive.Linq;
using Nito.AsyncEx;
using System.Threading;

#pragma warning disable CS4014 // Because this call is not awaited, execution of the current method continues before the call is completed

public static class EnumerableExtensions
{
    public static IEnumerable<Func<U>> Defer<T, U>
        ( this IEnumerable<T> source, Func<T, U> selector) 
        => source.Select(s => (Func<U>)(() => selector(s)));
}


public class Program
{
    /// <summary>
    /// Returns the time to wait before processing another item
    /// if the rate limit is to be maintained
    /// </summary>
    /// <param name="desiredRateLimit"></param>
    /// <param name="currentItemCount"></param>
    /// <param name="elapsedTotalSeconds"></param>
    /// <returns></returns>
    private static double Delay(double desiredRateLimit, int currentItemCount, double elapsedTotalSeconds)
    {
        var time = elapsedTotalSeconds;
        var timeout = currentItemCount / desiredRateLimit;
        return timeout - time;
    }

    /// <summary>
    /// Consume the tasks in parallel but with a rate limit. The results
    /// are returned as an observable.
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="tasks"></param>
    /// <param name="rateLimit"></param>
    /// <returns></returns>
    public static IObservable<T> RateLimit<T>(IEnumerable<Func<Task<T>>> tasks, double rateLimit){
        var s = System.Diagnostics.Stopwatch.StartNew();
        var n = 0;
        var sem = new  AsyncCountdownEvent(1);

        var errors = new ConcurrentBag<Exception>();

        return Observable.Create<T>
            ( observer =>
            {

                var ctx = new CancellationTokenSource();
                Task.Run
                    ( async () =>
                    {
                        foreach (var taskFn in tasks)
                        {
                            n++;
                            ctx.Token.ThrowIfCancellationRequested();

                            var elapsedTotalSeconds = s.Elapsed.TotalSeconds;
                            var delay = Delay( rateLimit, n, elapsedTotalSeconds );
                            if (delay > 0)
                                await Task.Delay( TimeSpan.FromSeconds( delay ), ctx.Token );

                            sem.AddCount( 1 );
                            Task.Run
                                ( async () =>
                                {
                                    try
                                    {
                                        observer.OnNext( await taskFn() );
                                    }
                                    catch (Exception e)
                                    {
                                        errors.Add( e );
                                    }
                                    finally
                                    {
                                        sem.Signal();
                                    }
                                }
                                , ctx.Token );
                        }
                        sem.Signal();
                        await sem.WaitAsync( ctx.Token );
                        if(errors.Count>0)
                            observer.OnError(new AggregateException(errors));
                        else
                            observer.OnCompleted();
                    }
                      , ctx.Token );

                return Disposable.Create( () => ctx.Cancel() );
            } );
    }

    #region hosts



    public static string [] Hosts = new [] { "google.com" };

    #endregion


    public static void Main()
    {
        var s = System.Diagnostics.Stopwatch.StartNew();

        var rate = 25;

        var n = Hosts.Length;

        var expectedTime = n/rate;

        IEnumerable<Func<Task<IPHostEntry>>> dnsTaskFactories = Hosts.Defer( async host =>
        {
            try
            {
                return await Dns.GetHostEntryAsync( host );
            }
            catch (Exception e)
            {
                throw new Exception($"Can't resolve {host}", e);
            }
        } );

        IObservable<IPHostEntry> results = RateLimit( dnsTaskFactories, rate );

        results
            .Subscribe( result =>
            {
                Console.WriteLine( "result " + DateTime.Now + " " + result.AddressList[0].ToString() );
            },
            onCompleted: () =>
            {
                Console.WriteLine( "Completed" );

                PrintTimes( s, expectedTime );
            },
            onError: e =>
            {
                Console.WriteLine( "Errored" );

                PrintTimes( s, expectedTime );

                if (e is AggregateException ae)
                {
                    Console.WriteLine( e.Message );
                    foreach (var innerE in ae.InnerExceptions)
                    {
                        Console.WriteLine( $"     " + innerE.GetType().Name + " " + innerE.Message );
                    }
                }
                else
                {
                        Console.WriteLine( $"got error " + e.Message );
                }
            }

            );

        Console.WriteLine("Press enter to exit");
        Console.ReadLine();
    }

    private static void PrintTimes(Stopwatch s, int expectedTime)
    {
        Console.WriteLine( "Done" );
        Console.WriteLine( "Elapsed Seconds " + s.Elapsed.TotalSeconds );
        Console.WriteLine( "Expected Elapsed Seconds " + expectedTime );
    }
}

The last few lines of the output are:

result 5/23/2017 3:23:36 PM 84.16.241.74
result 5/23/2017 3:23:36 PM 84.16.241.74
result 5/23/2017 3:23:36 PM 157.7.105.52
result 5/23/2017 3:23:36 PM 223.223.182.225
result 5/23/2017 3:23:36 PM 64.34.93.5
result 5/23/2017 3:23:36 PM 212.83.211.103
result 5/23/2017 3:23:36 PM 205.185.216.10
result 5/23/2017 3:23:36 PM 198.232.125.32
result 5/23/2017 3:23:36 PM 66.231.176.100
result 5/23/2017 3:23:36 PM 54.239.34.12
result 5/23/2017 3:23:36 PM 54.239.34.12
result 5/23/2017 3:23:37 PM 219.84.203.116
Errored
Done
Elapsed Seconds 19.9990118
Expected Elapsed Seconds 19
One or more errors occurred.
     Exception Can't resolve adv758968.ru
     Exception Can't resolve fr.a3dfp.net
     Exception Can't resolve ads.adwitserver.com
     Exception Can't resolve www.adtrader.com
     Exception Can't resolve trak-analytics.blic.rs
     Exception Can't resolve ads.buzzcity.net

I couldn't paste the full code, so here is a link to the code including the list of hosts.

https://gist.github.com/bradphelan/084e4b1ce2604bbdf858d948699cc190
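The pacing in RateLimit above rests on one line of arithmetic in Delay: the n-th item is due at n / rate seconds after the start, and the loop waits only if that moment is still in the future. The sketch below restates that formula on its own. `Pacing.SecondsToWait` is a hypothetical name, and unlike Delay it clamps negative values to zero instead of leaving the `> 0` check to the caller.

```csharp
using System;

public static class Pacing
{
    // The n-th item is due at n / rate seconds after the start; wait only for
    // the part of that interval that has not already elapsed.
    public static double SecondsToWait(double ratePerSecond, int itemNumber, double elapsedSeconds)
    {
        double dueAt = itemNumber / ratePerSecond;
        return Math.Max(0.0, dueAt - elapsedSeconds);
    }
}

// At 25 requests per second, item 50 is due at 2.0 s:
//   after 1.5 s elapsed, SecondsToWait(25, 50, 1.5) is 0.5 (wait half a second)
//   after 3.0 s elapsed, SecondsToWait(25, 50, 3.0) is 0.0 (already behind, start at once)
```

Measuring against the total elapsed time rather than sleeping a fixed 1 / rate between items means that time lost to a slow iteration is made up on the following ones, so the average rate stays at the target.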

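Have you considered using the TPL Dataflow library? It has a very convenient way to limit concurrent operations of the same type. It also lets you throttle the whole pipeline by limiting the buffer size.

Basically, you need to create a pipeline with a BufferBlock that holds the hosts, a TransformBlock that performs the DNS lookups, a BatchBlock that gathers the results, and an ActionBlock that saves each batch to the database.

So your code could look like this: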

// buffer limited to 30 items in queue
// all other items would be postponed and added to queue automatically
// order in queue is preserved
var hosts = new BufferBlock<string>(new DataflowBlockOptions { BoundedCapacity = 30 });

// get a host and perform a dns search operation
var handler = new TransformBlock<string, IPHostEntry>(host => Dns.GetHostEntry(host),
  // no more than 10 simultaneous requests at a time
  new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 10 });

// gather results in an array of size 500 
var batchBlock = new BatchBlock<IPHostEntry>(500);

// get the resulting array and save it to database
var batchSave = new ActionBlock<IPHostEntry[]>(r => GetHostEntryCallback(r));

// link all the blocks to automatically propagate items along the pipeline
var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };
hosts.LinkTo(handler, linkOptions);
handler.LinkTo(batchBlock, linkOptions);
batchBlock.LinkTo(batchSave, linkOptions);

// provide the data to pipeline
for (var i = 0; i < TotalRecords.Rows.Count; ++i)
{
    var host = TotalRecords.Rows[i].ItemArray[0].ToString();
    // async wait for item to be sent to pipeline
    // will throttle starting with the 31st item in the buffer queue
    await hosts.SendAsync(host);
}

// all hosts have been sent; mark the input as complete and let the pipeline drain
hosts.Complete();

// wait for the last block to finish its execution
await batchSave.Completion;

// notify user that update is over

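I encourage you to read the whole How-to section on MSDN to get a better understanding of what you can do with this library, and perhaps continue with the official documentation.

By the way, you can use the SqlBulkCopy class to update the database (if it meets your requirements); it is usually faster than a regular update with SqlDataAdapter.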
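One detail of the pipeline above is worth checking in practice: a block with the default unbounded capacity accepts everything it is offered, so a bounded BufferBlock linked to an unbounded TransformBlock drains immediately and SendAsync rarely has to wait. A minimal sketch of one way to get backpressure is to set BoundedCapacity on the block that does the work. `BoundedPipeline`, its parameters, and the injected `lookup` delegate are assumptions made for this illustration, and the sketch assumes the System.Threading.Tasks.Dataflow package is available. It limits concurrency and queue length, not requests per second.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class BoundedPipeline
{
    // lookup is passed in so it can be Dns.GetHostEntry in production
    // or a stub in a test.
    public static async Task<List<TResult>> RunAsync<TResult>(
        IEnumerable<string> hosts,
        Func<string, TResult> lookup,
        int maxParallel,
        int capacity)
    {
        var results = new List<TResult>();

        var handler = new TransformBlock<string, TResult>(lookup,
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = maxParallel, // simultaneous lookups
                BoundedCapacity = capacity            // SendAsync waits once this many items are pending
            });

        // An ActionBlock runs one item at a time by default, so the list is
        // never modified concurrently.
        var collect = new ActionBlock<TResult>(r => results.Add(r));

        handler.LinkTo(collect, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var host in hosts)
            await handler.SendAsync(host); // completes only when the block has room

        handler.Complete();
        await collect.Completion;
        return results;
    }
}

// Possible call, assuming a sequence of host names called hostNames:
// var entries = await BoundedPipeline.RunAsync(
//     hostNames, h => System.Net.Dns.GetHostEntry(h), 10, 30);
```

TransformBlock keeps its output in input order by default, so the returned list lines up with the host list even when lookups finish out of order.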
