
Using TPL with parallel blocking IO operations

Preface: I'm aware that using the ThreadPool (either via TPL or directly) for IO operations is generally frowned upon because IO is necessarily sequential. However, my problem relates to "parallel IO" with blocking calls that don't expose an Async method.

I'm writing a GUI tool that fetches information about computers on the network. It does this (simplified code):

String[] computerNames = { "foo", "bar", "baz" };
foreach(String computerName in computerNames) {

    Task.Factory
        .StartNew( GetComputerInfo, computerName )
        .ContinueWith( ShowOutputInGui, RunOnGuiThread );

}

private ComputerInfo GetComputerInfo(String machineName) {

    Task<Int64>     pingTime  = Task.Factory.StartNew( () => GetPingTime( machineName ) );
    Task<Process[]> processes = Task.Factory.StartNew( () => System.Diagnostics.Process.GetProcesses( machineName ) );
    // and loads more

    Task.WaitAll( pingTime, processes, etc );

    return new ComputerInfo( pingTime.Result, processes.Result, etc );
}

When I run this code I'm finding it takes a surprisingly long amount of time to run compared to the old sequential code I had.

Note that each task in the GetComputerInfo method is entirely independent of the others around it (e.g. ping time can be computed separately from GetProcesses ), yet when I inserted some Stopwatch timing calls, I discovered that the individual sub-tasks, such as the GetProcesses call, were only being started up to 3000ms after GetComputerInfo had been called, so there is some large delay going on.

I noticed that when I reduce the number of outer parallel calls into GetComputerInfo (by reducing the size of the computerNames array) the first results are returned almost immediately. Some of the computer names are for computers that are turned off, so calls to GetProcesses and GetPingTime take a very long time before timing out (my real code catches the exceptions). This is probably because the offline computers are blocking Tasks from being run and the TPL naturally restricts the number of running tasks to my CPU hardware thread count (8).

Is there a way to tell TPL to not let the inner tasks (eg GetProcesses ) block outer tasks ( GetComputerInfo )?

(I've tried the "Parent/Child" task attachment/blocking, but it doesn't apply to my situation as I never explicitly attach child tasks to parent tasks, and the parent task naturally waits with Task.WaitAll anyway).

I assume that your foreach loop lives in some event handler, so the first thing you should do is mark that handler as async so it can call the rest of your code asynchronously. After that, you should rework GetComputerInfo so that it is async all the way down .

There are additional pitfalls in your code: StartNew is dangerous , because by default it uses the Current scheduler for its tasks rather than the Default one (so you need a different overload). Unfortunately, that overload takes more parameters, so the code will not be as simple. The good news is that you need that overload anyway, to tell the scheduler that your tasks are long-running so it should use a dedicated thread for them:

TaskCreationOptions.LongRunning

Specifies that a task will be a long-running, coarse-grained operation involving fewer, larger components than fine-grained systems. It provides a hint to the TaskScheduler that oversubscription may be warranted.

Oversubscription lets you create more threads than the available number of hardware threads. It also provides a hint to the task scheduler that an additional thread might be required for the task so that it does not block the forward progress of other threads or work items on the local thread-pool queue.
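To make the effect of the hint concrete, here is a minimal sketch that checks which kind of thread a task body runs on. The class and method names (LongRunningDemo, RunsOnPoolThread) are made up for illustration. The documentation describes LongRunning only as a hint, so the dedicated-thread behaviour is what current .NET runtimes typically do with the default scheduler rather than a guarantee.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningDemo
{
    // Hypothetical helper: starts a task on the default scheduler with the
    // given options and reports whether its body ran on a thread-pool thread.
    public static Task<bool> RunsOnPoolThread(TaskCreationOptions options)
    {
        return Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            CancellationToken.None,
            options,
            TaskScheduler.Default);
    }
}
```

Awaiting RunsOnPoolThread(TaskCreationOptions.None) normally reports a pool thread, while TaskCreationOptions.LongRunning normally reports a separate thread, which is why blocked long-running tasks stop holding up the pool's queue.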

Also, you should avoid the WaitAll method: it is a blocking operation, so you have one thread fewer to do the actual work. You probably want to use WhenAll instead.
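As a rough illustration of the difference, the sketch below awaits two blocking calls with Task.WhenAll, so the calling thread is released while they run instead of sitting inside WaitAll. SlowCall is a hypothetical stand-in for something like GetPingTime, and the delays and values are arbitrary.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical stand-in for a blocking call that has no Async variant.
    private static int SlowCall(int delayMs, int value)
    {
        Thread.Sleep(delayMs);
        return value;
    }

    public static async Task<int> SumAsync()
    {
        // Both calls start immediately and run side by side.
        Task<int> first = Task.Run(() => SlowCall(100, 1));
        Task<int> second = Task.Run(() => SlowCall(100, 2));

        // No thread is blocked while waiting; the method resumes when both finish.
        int[] results = await Task.WhenAll(first, second);
        return results[0] + results[1];
    }
}
```

When the tasks have different result types, as in GetComputerInfo, WhenAll returns a plain Task, and each result is then read from its own task after the await.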

And finally, to return your ComputerInfo result you can either await the combined task or use a continuation together with a TaskCompletionSource , so your code could be something like this (cancellation logic also added):

using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// handle the event in a fire-and-forget manner
async void btn_Click(object sender, EventArgs e)
{
    var computerNames = new[] { "foo", "bar", "baz" };
    foreach (String computerName in computerNames)
    {
        var compCancelSource = new CancellationTokenSource();

        // asynchronously wait for the next computer's info
        var compInfo = await GetComputerInfo(computerName, compCancelSource.Token);
        // We are back in the UI context here, so no RunOnGuiThread scheduler is needed
        ShowOutputInGui(compInfo);
    }
}

private async Task<ComputerInfo> GetComputerInfo(String machineName, CancellationToken token)
{
    var pingTime = Task.Factory.StartNew(
        // function to run
        () => GetPingTime(machineName),
        // token that can cancel the task before it starts
        token,
        // tell the scheduler that this task could take a long time to run,
        // so a new dedicated thread will probably be used for it
        TaskCreationOptions.LongRunning,
        // use the default scheduler rather than the current (possibly UI) one
        TaskScheduler.Default);

    // Task.Run has no overload that accepts TaskCreationOptions, so StartNew is used here too
    var processes = Task.Factory.StartNew(() => Process.GetProcesses(machineName), token, TaskCreationOptions.LongRunning, TaskScheduler.Default);
    // and loads more

    await Task.WhenAll(pingTime, processes, etc);
    return new ComputerInfo(pingTime.Result, processes.Result, etc);

    // Alternative without async/await: complete a TaskCompletionSource from a continuation
    //var tcs = new TaskCompletionSource<ComputerInfo>();
    //Task.WhenAll(pingTime, processes, etc)
    //    .ContinueWith(aggregateTask =>
    //    {
    //        if (aggregateTask.Status == TaskStatus.RanToCompletion)
    //        {
    //            // the tasks have different result types, so read each result from its own task
    //            tcs.SetResult(new ComputerInfo(pingTime.Result, processes.Result, etc));
    //        }
    //        else
    //        {
    //            // cancel or error handling
    //        }
    //    });

    // return the awaitable task
    //return tcs.Task;
}
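One thing to note about the handler above: awaiting inside the foreach loop queries the computers one after another, so a single offline machine still delays everything behind it. A possible variation, sketched below under assumptions, starts every query up front and handles each result as soon as it is ready. QueryMachine, QueryMachineAsync and QueryAllAsync are hypothetical names, the Thread.Sleep stands in for the real blocking calls, and a string stands in for ComputerInfo.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrentQueries
{
    // Hypothetical stand-in for the blocking per-machine work.
    private static string QueryMachine(string name)
    {
        Thread.Sleep(200);
        return name.ToUpperInvariant();
    }

    private static Task<string> QueryMachineAsync(string name, CancellationToken token)
    {
        return Task.Factory.StartNew(
            () => QueryMachine(name),
            token,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);
    }

    // Starts all queries first, then passes each result to onResult in completion order.
    public static async Task QueryAllAsync(
        IEnumerable<string> names, Action<string> onResult, CancellationToken token)
    {
        List<Task<string>> pending = names.Select(n => QueryMachineAsync(n, token)).ToList();
        while (pending.Count > 0)
        {
            Task<string> finished = await Task.WhenAny(pending);
            pending.Remove(finished);
            // Awaiting the finished task rethrows its exception, if it failed.
            onResult(await finished);
        }
    }
}
```

When QueryAllAsync is awaited from a UI event handler, the onResult callback runs in the UI context after each await, so it can update the GUI directly. A failed query throws out of the loop in this sketch, so real code would wrap the inner await in a try/catch.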
