
What is the async/await equivalent of a ThreadPool server?

I am working on a TCP server that, using synchronous APIs and the thread pool, looks something like this:

TcpListener listener;
void Serve(){
  while(true){
    var client = listener.AcceptTcpClient();
    ThreadPool.QueueUserWorkItem(this.HandleConnection, client);
    //Or alternatively new Thread(HandleConnection).Start(client)
  }
}

Assuming my goal is to handle as many concurrent connections as possible with the lowest resource usage, it seems this will quickly be limited by the number of available threads. I suspect that by using non-blocking Task APIs, I will be able to handle many more connections with fewer resources.

My initial impression is something like:

async Task Serve(){
  while(true){
    var client = await listener.AcceptTcpClientAsync();
    HandleConnectionAsync(client); //fire and forget?
  }
}

But it strikes me that this could cause bottlenecks. Perhaps HandleConnectionAsync will take an unusually long time to hit the first await, and will stop the main accept loop from proceeding. Will this only ever use one thread, or will the runtime magically run things on multiple threads as it sees fit?

Is there a way to combine these two approaches so that my server will use exactly the number of threads it needs for the number of actively running tasks, but so that it will not block threads unnecessarily on IO operations?

Is there an idiomatic way to maximize throughput in a situation like this?

I'd let the Framework manage the threading and wouldn't create any extra threads unless profiling suggests I need to, especially if the calls inside HandleConnectionAsync are mostly IO-bound.

Anyway, if you'd like to release the calling thread (the dispatcher) at the beginning of HandleConnectionAsync, there's a very easy solution. You can jump to a new thread from the ThreadPool with await Task.Yield(). That works if your server runs in an execution environment that does not have a synchronization context installed on the initial thread (a console app, a WCF service), which is normally the case for a TCP server.
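As a rough sketch of that effect in isolation (assuming a console app with no synchronization context), the snippet below measures how long a caller is held up before it gets the Task back. YieldDemo, HandlerAsync and MeasureCallerDelayAsync are hypothetical names made up for this illustration, and Thread.Sleep merely stands in for CPU-bound setup work:

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

static class YieldDemo
{
    // Hypothetical handler: Thread.Sleep stands in for CPU-bound work
    // that runs before the first "real" await.
    static async Task HandlerAsync(bool yieldFirst)
    {
        if (yieldFirst)
            await Task.Yield(); // no synchronization context: the rest is queued to the thread pool
        Thread.Sleep(300);
        await Task.Delay(10);
    }

    // Returns how many milliseconds the caller waited before HandlerAsync returned its Task.
    public static async Task<long> MeasureCallerDelayAsync(bool yieldFirst)
    {
        var stopwatch = Stopwatch.StartNew();
        Task pending = HandlerAsync(yieldFirst);
        long callerBlockedMs = stopwatch.ElapsedMilliseconds;
        await pending; // let the handler finish before returning the measurement
        return callerBlockedMs;
    }

    static async Task Main()
    {
        Console.WriteLine("Without Task.Yield: caller held up for ~{0} ms", await MeasureCallerDelayAsync(false));
        Console.WriteLine("With Task.Yield:    caller held up for ~{0} ms", await MeasureCallerDelayAsync(true));
    }
}
```

Without the yield, the caller should be held up for roughly the whole synchronous part; with it, the call should return almost immediately. Exact timings depend on the machine and on thread-pool load.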

The following server illustrates this (the code is originally from here). Note that the main while loop doesn't create any threads explicitly:

using System;
using System.Collections.Generic;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

class Program
{
    object _lock = new Object(); // sync lock 
    List<Task> _connections = new List<Task>(); // pending connections

    // The core server task
    private async Task StartListener()
    {
        var tcpListener = TcpListener.Create(8000);
        tcpListener.Start();
        while (true)
        {
            var tcpClient = await tcpListener.AcceptTcpClientAsync();
            Console.WriteLine("[Server] Client has connected");
            var task = StartHandleConnectionAsync(tcpClient);
            // if already faulted, re-throw any error on the calling context
            if (task.IsFaulted)
                await task;
        }
    }

    // Register and handle the connection
    private async Task StartHandleConnectionAsync(TcpClient tcpClient)
    {
        // start the new connection task
        var connectionTask = HandleConnectionAsync(tcpClient);

        // add it to the list of pending tasks
        lock (_lock)
            _connections.Add(connectionTask);

        // catch all errors of HandleConnectionAsync
        try
        {
            await connectionTask;
            // we may be on another thread after "await"
        }
        catch (Exception ex)
        {
            // log the error
            Console.WriteLine(ex.ToString());
        }
        finally
        {
            // remove pending task
            lock (_lock)
                _connections.Remove(connectionTask);
        }
    }

    // Handle new connection
    private async Task HandleConnectionAsync(TcpClient tcpClient)
    {
        await Task.Yield();
        // continue asynchronously on another thread

        using (var networkStream = tcpClient.GetStream())
        {
            var buffer = new byte[4096];
            Console.WriteLine("[Server] Reading from client");
            var byteCount = await networkStream.ReadAsync(buffer, 0, buffer.Length);
            var request = Encoding.UTF8.GetString(buffer, 0, byteCount);
            Console.WriteLine("[Server] Client wrote {0}", request);
            var serverResponseBytes = Encoding.UTF8.GetBytes("Hello from server");
            await networkStream.WriteAsync(serverResponseBytes, 0, serverResponseBytes.Length);
            Console.WriteLine("[Server] Response has been written");
        }
    }

    // The entry point of the console app
    static async Task Main(string[] args)
    {
        Console.WriteLine("Hit Ctrl-C to exit.");
        await new Program().StartListener();
    }
}

Alternatively, the code might look like the following, without await Task.Yield(). Note that I pass an async lambda to Task.Run, because I still want to benefit from the async APIs inside HandleConnectionAsync and use await in there:

// Handle new connection
private static Task HandleConnectionAsync(TcpClient tcpClient)
{
    return Task.Run(async () =>
    {
        using (var networkStream = tcpClient.GetStream())
        {
            var buffer = new byte[4096];
            Console.WriteLine("[Server] Reading from client");
            var byteCount = await networkStream.ReadAsync(buffer, 0, buffer.Length);
            var request = Encoding.UTF8.GetString(buffer, 0, byteCount);
            Console.WriteLine("[Server] Client wrote {0}", request);
            var serverResponseBytes = Encoding.UTF8.GetBytes("Hello from server");
            await networkStream.WriteAsync(serverResponseBytes, 0, serverResponseBytes.Length);
            Console.WriteLine("[Server] Response has been written");
        }
    });
}

Updated, based upon the comment: if this is going to be library code, the execution environment is indeed unknown and may have a non-default synchronization context. In this case, I'd rather run the main server loop on a pool thread (which is free of any synchronization context):

private Task StartListener() // instance method: it calls the instance method StartHandleConnectionAsync
{
    return Task.Run(async () => 
    {
        var tcpListener = TcpListener.Create(8000);
        tcpListener.Start();
        while (true)
        {
            var tcpClient = await tcpListener.AcceptTcpClientAsync();
            Console.WriteLine("[Server] Client has connected");
            var task = StartHandleConnectionAsync(tcpClient);
            if (task.IsFaulted)
                await task;
        }
    });
}

This way, all child tasks created inside StartListener wouldn't be affected by the synchronization context of the client code, so I wouldn't have to call Task.ConfigureAwait(false) anywhere explicitly.
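A small probe can make this visible. The sketch below is only an illustration (ContextDemo and Probe are hypothetical names): it installs a plain SynchronizationContext on the calling thread to simulate client code, then checks what the body of Task.Run sees:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ContextDemo
{
    // Returns (does the caller see a context?, does the Task.Run body see one?).
    public static (bool callerHasContext, bool poolHasContext) Probe()
    {
        SynchronizationContext previous = SynchronizationContext.Current;
        // Simulate client code that has a synchronization context installed.
        SynchronizationContext.SetSynchronizationContext(new SynchronizationContext());
        try
        {
            bool callerHasContext = SynchronizationContext.Current != null;

            // Read the result through a TaskCompletionSource: blocking on the Task.Run
            // task itself could let the runtime inline the work on this thread.
            var seen = new TaskCompletionSource<bool>();
            Task.Run(() => seen.SetResult(SynchronizationContext.Current != null));
            bool poolHasContext = seen.Task.Result;

            return (callerHasContext, poolHasContext);
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }

    static void Main()
    {
        var (caller, pool) = Probe();
        Console.WriteLine("Caller sees a synchronization context: {0}", caller);
        Console.WriteLine("Task.Run body sees a synchronization context: {0}", pool);
    }
}
```

The caller should report a context while the Task.Run body should not, which is why awaits inside the pool-hosted loop have no client context to capture.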

Updated in 2020: someone just asked a good question off-site:

I was wondering what is the reason for using a lock here? This is not necessary for exception handling. My understanding is that a lock is used because List is not thread safe, therefore the real question is why add the tasks to a list (and incur the cost of a lock under load).

Since Task.Run is perfectly able to keep track of the tasks it started, my thinking is that in this specific example the lock is useless; however, you put it there because in a real program, having the tasks in a list allows us to, for example, iterate over the currently running tasks and terminate them cleanly if the program receives a termination signal from the operating system.

Indeed, in a real-life scenario we almost always want to keep track of the tasks we start with Task.Run (or any other Task objects which are "in-flight"), for a few reasons:

  • To track task exceptions, which otherwise might be silently swallowed if they go unobserved elsewhere.
  • To be able to wait asynchronously for completion of all the pending tasks (e.g., consider a Start/Stop UI button, or handling a request to start/stop inside a headless Windows service).
  • To be able to control (and throttle/limit) the number of tasks we allow to be in-flight simultaneously.
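The three reasons above can be sketched together in one small helper. This is an illustration under stated assumptions, not the code from the answer: TaskTracker, StartAsync, WhenAllAsync and TrackerDemo are hypothetical names, and a SemaphoreSlim is just one way to cap the number of in-flight tasks:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: observes exceptions, tracks pending work, limits concurrency.
sealed class TaskTracker
{
    private readonly object _lock = new object();
    private readonly List<Task> _pending = new List<Task>();
    private readonly SemaphoreSlim _slots;

    public TaskTracker(int maxInFlight) { _slots = new SemaphoreSlim(maxInFlight); }

    // Completes once the work has been started, i.e. once a slot was free.
    public async Task StartAsync(Func<Task> work)
    {
        await _slots.WaitAsync(); // throttle: wait for a free slot
        Task task = RunAndObserveAsync(work);
        lock (_lock)
        {
            _pending.RemoveAll(t => t.IsCompleted); // prune finished entries
            _pending.Add(task);
        }
    }

    private async Task RunAndObserveAsync(Func<Task> work)
    {
        try { await Task.Run(work); }
        catch (Exception ex) { Console.WriteLine("[Tracker] Task failed: " + ex.Message); }
        finally { _slots.Release(); }
    }

    // Lets a Stop handler wait asynchronously for everything still in flight.
    public Task WhenAllAsync()
    {
        lock (_lock) return Task.WhenAll(_pending.ToArray());
    }
}

static class TrackerDemo
{
    // Starts `jobs` short tasks and returns the highest concurrency observed.
    public static async Task<int> RunAsync(int jobs, int maxInFlight)
    {
        var tracker = new TaskTracker(maxInFlight);
        var gate = new object();
        int running = 0, peak = 0;
        for (int i = 0; i < jobs; i++)
        {
            await tracker.StartAsync(async () =>
            {
                lock (gate) { running++; peak = Math.Max(peak, running); }
                await Task.Delay(50);
                lock (gate) { running--; }
            });
        }
        await tracker.WhenAllAsync();
        return peak;
    }

    static async Task Main()
    {
        Console.WriteLine("Peak concurrency: {0}", await RunAsync(jobs: 6, maxInFlight: 2));
    }
}
```

Because an accept loop would await StartAsync, a full set of slots also applies back-pressure to the loop itself, and the observed peak should never exceed maxInFlight.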

There are better mechanisms to handle real-life concurrency workflows (e.g., the TPL Dataflow library), but I did include the task list and the lock on purpose here, even in this simple example. It might be tempting to use a fire-and-forget approach, but it is almost never a good idea. In my own experience, when I did want fire-and-forget, I used async void methods for that (check this).

The existing answers have correctly proposed using Task.Run(() => HandleConnection(client)); but have not explained why.

Here's why: you are concerned that HandleConnectionAsync might take some time to hit the first await. If you stick to using async IO (as you should in this case), this means that HandleConnectionAsync is doing CPU-bound work without any blocking. This is a perfect case for the thread pool, which is made to run short, non-blocking CPU work.

And you are right that the accept loop would be throttled if HandleConnectionAsync took a long time before returning (maybe because there is significant CPU-bound work in it). This is to be avoided if you need to accept new connections at a high rate.

If you are sure that no significant work is throttling the loop, you can save the additional thread-pool Task and skip it.

Alternatively, you can have multiple accepts running at the same time. Replace await Serve(); with (for example):

var serverTasks =
    Enumerable.Range(0, Environment.ProcessorCount)
    .Select(_ => Serve());
await Task.WhenAll(serverTasks);

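This removes the scalability problems. Note that await will swallow all but one error here: it rethrows only the first exception, although all of them remain available on the Exception property of the task returned by Task.WhenAll.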
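To see that behaviour in isolation, here is a minimal sketch in which FailingServeAsync is a hypothetical stand-in for a Serve() loop that fails, and WhenAllDemo is a made-up name. It keeps a reference to the combined task so every failure can still be inspected after the await:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

static class WhenAllDemo
{
    // Hypothetical stand-in for a Serve() loop that fails after a short delay.
    static async Task FailingServeAsync(int id)
    {
        await Task.Delay(10 * id);
        throw new InvalidOperationException("loop " + id + " failed");
    }

    // Returns (exceptions surfaced by await, exceptions recorded on the combined task).
    public static async Task<(int thrown, int recorded)> RunAsync()
    {
        Task all = Task.WhenAll(Enumerable.Range(1, 3).Select(id => FailingServeAsync(id)));
        int thrown = 0;
        try
        {
            await all;
        }
        catch (InvalidOperationException)
        {
            thrown = 1; // await rethrows a single exception
        }
        // The AggregateException on the task still holds every failure.
        return (thrown, all.Exception.InnerExceptions.Count);
    }

    static async Task Main()
    {
        var (thrown, recorded) = await RunAsync();
        Console.WriteLine("await surfaced {0} exception(s); the task recorded {1}", thrown, recorded);
    }
}
```

Logging all.Exception.InnerExceptions in the catch block is one way to avoid losing the other failures.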

Try:

TcpListener listener;
void Serve(){
  while(true){
    var client = listener.AcceptTcpClient();
    Task.Run(() => this.HandleConnection(client));
    //Or alternatively new Thread(HandleConnection).Start(client)
  }
}

According to Microsoft (http://msdn.microsoft.com/en-AU/library/hh524395.aspx#BKMK_VoidReturnType), the void return type shouldn't be used because the caller cannot catch exceptions thrown from an async void method. As you have pointed out, you do need "fire and forget" tasks, so my conclusion is that you must always return Task (as Microsoft has said), but you should catch the error using:

TaskInstance.ContinueWith(i => { /* exception handler */ }, TaskContinuationOptions.OnlyOnFaulted);

An example I used as proof is below:

public static void Main()
{
    Awaitable()
        .ContinueWith(
            i =>
                {
                    foreach (var exception in i.Exception.InnerExceptions)
                    {
                        Console.WriteLine(exception.Message);
                    }
                },
            TaskContinuationOptions.OnlyOnFaulted);
    Console.WriteLine("This needs to come out before my exception");
    Console.ReadLine();
}

public static async Task Awaitable()
{
    await Task.Delay(3000);
    throw new Exception("Hey I can catch these pesky things");
}

Is there any reason you need to accept connections asynchronously? I mean, does awaiting a client connection give you any value? The only reason for doing it would be that some other work is going on in the server while it waits for a connection. If there is, you could probably do something like this:

    public async void Serve()
    {
        while (true)
        {
            var client = await _listener.AcceptTcpClientAsync();
            Task.Factory.StartNew(() => HandleClient(client), TaskCreationOptions.LongRunning);
        }
    }

This way, accepting releases the current thread, leaving room for other things to be done, and the handling runs on a new thread. The only overhead is spawning a new thread to handle the client before going straight back to accepting a new connection.

Edit: Just realized it's almost the same code you wrote. I think I need to read your question again to better understand what you're actually asking :S

Edit2:

Is there a way to combine these two approaches so that my server will use exactly the number of threads it needs for the number of actively running tasks, but so that it will not block threads unnecessarily on IO operations?

I think my solution actually answers this question. Is it really necessary, though?

Edit3: Made Task.Factory.StartNew() actually create a new thread.
