Limit the number of parallel threads in C#

I am writing a C# program to generate half a million files and upload them via FTP. I want to process 4 files in parallel, since the machine has 4 cores and file generation is the much slower step. Is it possible to convert the following PowerShell example to C#? Or is there a better framework for this, such as an actor framework in C# (like F#'s MailboxProcessor)?

PowerShell example

$maxConcurrentJobs = 3;

# Read the input and queue it up
$jobInput = get-content .\input.txt
$queue = [System.Collections.Queue]::Synchronized( (New-Object System.Collections.Queue) )
foreach($item in $jobInput)
{
    $queue.Enqueue($item)
}

# Function that pops input off the queue and starts a job with it
function RunJobFromQueue
{
    if( $queue.Count -gt 0)
    {
        $j = Start-Job -ScriptBlock {param($x); Get-WinEvent -LogName $x} -ArgumentList $queue.Dequeue()
        Register-ObjectEvent -InputObject $j -EventName StateChanged -Action { RunJobFromQueue; Unregister-Event $eventsubscriber.SourceIdentifier; Remove-Job $eventsubscriber.SourceIdentifier } | Out-Null
    }
}

# Start up to the max number of concurrent jobs
# Each job will take care of running the rest
for( $i = 0; $i -lt $maxConcurrentJobs; $i++ )
{
    RunJobFromQueue
}

Update:
The connection to the remote FTP server can be slow, so I want to limit the number of concurrent FTP uploads.

Assuming you're building this with the TPL, you can set ParallelOptions.MaxDegreeOfParallelism to whatever you want it to be.

See Parallel.For for a code example.
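
As a minimal sketch of that suggestion, the loop below passes a ParallelOptions instance to Parallel.ForEach and records the highest concurrency it observes. ProcessFile and the file names are hypothetical stand-ins for "generate one file and upload it"; they are not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledLoop
{
    // Hypothetical placeholder for generating and uploading one file.
    static void ProcessFile(string name)
    {
        Thread.Sleep(10);
    }

    // Processes every name with at most maxParallel concurrent calls and
    // returns the highest number of calls that were running at the same time.
    public static int Run(IEnumerable<string> names, int maxParallel)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        var gate = new object();
        int running = 0, peak = 0;

        Parallel.ForEach(names, options, name =>
        {
            lock (gate) { running++; if (running > peak) peak = running; }
            try { ProcessFile(name); }
            finally { lock (gate) { running--; } }
        });
        return peak;
    }

    public static void Main()
    {
        var names = Enumerable.Range(0, 40).Select(i => "file" + i + ".txt");
        Console.WriteLine("Peak concurrency: " + Run(names, 4));
    }
}
```

The reported peak should never exceed the configured limit, although it may be lower if the scheduler decides to use fewer threads.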

The Task Parallel Library is your friend here. See this link, which describes what's available to you. It ships with .NET Framework 4 and runs work on thread-pool (background) threads, by default scaling the degree of parallelism to the number of processors on the running machine.

Perhaps something along the lines of:

ParallelOptions options = new ParallelOptions();

options.MaxDegreeOfParallelism = 4;

Then in your loop something like:

Parallel.Invoke(options,
 () => new WebClient().UploadFile("http://www.linqpad.net", "lp.html"),
 () => new WebClient().UploadFile("http://www.jaoo.dk", "jaoo.html"));

If you are using .NET 4.0, you can use the Task Parallel Library.

Supposing you're iterating through the half million files, you can parallelize the iteration with Parallel.ForEach, for instance, or you can have a look at PLINQ. Here is a comparison between the two.
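
For the PLINQ route, a rough sketch might look like the following. It caps the query with WithDegreeOfParallelism and keeps the results in input order with AsOrdered. The Generate method is a hypothetical stand-in for producing one file's contents.

```csharp
using System;
using System.Linq;

public static class PlinqSketch
{
    // Hypothetical placeholder for generating the contents of file number i.
    static string Generate(int i)
    {
        return "content-" + i;
    }

    // Generates 'count' items using at most maxParallel concurrent workers.
    public static string[] GenerateAll(int count, int maxParallel)
    {
        return Enumerable.Range(0, count)
            .AsParallel()
            .AsOrdered()                          // keep results in input order
            .WithDegreeOfParallelism(maxParallel) // upper bound on concurrent workers
            .Select(i => Generate(i))
            .ToArray();
    }

    public static void Main()
    {
        var results = GenerateAll(10, 4);
        Console.WriteLine(results.Length + " items, first = " + results[0]);
    }
}
```

PLINQ fits best when each item produces a value you want to collect; for side-effect-only work such as uploading, Parallel.ForEach is usually the more natural choice.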

Essentially you're going to want to create an Action or Task for each file to upload, put them in a List, and then process that list, limiting the number that can be processed in parallel.

My blog post shows how to do this both with Tasks and with Actions, and provides a sample project you can download and run to see both in action.

With Actions

If using Actions, you can use the built-in .NET Parallel.Invoke function. Here we limit it to running at most 4 threads in parallel.

var listOfActions = new List<Action>();
foreach (var file in files)
{
    // Copy the loop variable so each lambda captures its own file.
    var localFile = file;
    // Note that we only create the Action here; nothing runs until Parallel.Invoke is called.
    listOfActions.Add(() => UploadFile(localFile));
}

var options = new ParallelOptions {MaxDegreeOfParallelism = 4};
Parallel.Invoke(options, listOfActions.ToArray());

This option doesn't support async though, and I'm assuming your UploadFile function will be async, so you might want to use the Task example below.

With Tasks

With Tasks there is no built-in function. However, you can use the one that I provide on my blog.

    /// <summary>
    /// Starts the given tasks and waits for them to complete. This will run, at most, the specified number of tasks in parallel.
    /// <para>NOTE: If one of the given tasks has already been started, an exception will be thrown.</para>
    /// </summary>
    /// <param name="tasksToRun">The tasks to run.</param>
    /// <param name="maxTasksToRunInParallel">The maximum number of tasks to run in parallel.</param>
    /// <param name="cancellationToken">The cancellation token.</param>
    public static async Task StartAndWaitAllThrottledAsync(IEnumerable<Task> tasksToRun, int maxTasksToRunInParallel, CancellationToken cancellationToken = new CancellationToken())
    {
        await StartAndWaitAllThrottledAsync(tasksToRun, maxTasksToRunInParallel, -1, cancellationToken);
    }

    /// <summary>
    /// Starts the given tasks and waits for them to complete. This will run the specified number of tasks in parallel.
    /// <para>NOTE: If a timeout is reached before the Task completes, another Task may be started, potentially running more than the specified maximum allowed.</para>
    /// <para>NOTE: If one of the given tasks has already been started, an exception will be thrown.</para>
    /// </summary>
    /// <param name="tasksToRun">The tasks to run.</param>
    /// <param name="maxTasksToRunInParallel">The maximum number of tasks to run in parallel.</param>
    /// <param name="timeoutInMilliseconds">The maximum milliseconds we should allow the max tasks to run in parallel before allowing another task to start. Specify -1 to wait indefinitely.</param>
    /// <param name="cancellationToken">The cancellation token.</param>
    public static async Task StartAndWaitAllThrottledAsync(IEnumerable<Task> tasksToRun, int maxTasksToRunInParallel, int timeoutInMilliseconds, CancellationToken cancellationToken = new CancellationToken())
    {
        // Convert to a list of tasks so that we don't enumerate over it multiple times needlessly.
        var tasks = tasksToRun.ToList();

        using (var throttler = new SemaphoreSlim(maxTasksToRunInParallel))
        {
            var postTaskTasks = new List<Task>();

            // Have each task notify the throttler when it completes so that it decrements the number of tasks currently running.
            tasks.ForEach(t => postTaskTasks.Add(t.ContinueWith(tsk => throttler.Release())));

            // Start running each task.
            foreach (var task in tasks)
            {
                // Increment the number of tasks currently running and wait if too many are running.
                await throttler.WaitAsync(timeoutInMilliseconds, cancellationToken);

                cancellationToken.ThrowIfCancellationRequested();
                task.Start();
            }

            // Wait for all of the provided tasks to complete.
            // We wait on the list of "post" tasks instead of the original tasks, otherwise there is a potential race condition where the throttler's using block is exited before some Tasks have had their "post" action completed, which references the throttler, resulting in an exception due to accessing a disposed object.
            await Task.WhenAll(postTaskTasks.ToArray());
        }
    }

Then, to create your list of Tasks and run them with, say, at most 4 at a time, you could do this:

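var listOfTasks = new List<Task>();
foreach (var file in files)
{
    var localFile = file;
    // Note that we create the Task here, but do not start it.
    // The Task constructor does not understand async lambdas: a Task built from
    // "async () => await UploadFile(...)" completes at the first await, which releases
    // the throttler too early. Block on the upload inside the Task instead.
    listOfTasks.Add(new Task(() => UploadFile(localFile).Wait()));
}
await Tasks.StartAndWaitAllThrottledAsync(listOfTasks, 4);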

Also, because this method supports async, it will not block the UI thread like using Parallel.Invoke or Parallel.ForEach would.
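
If the upload function is itself async, a simpler variant of the same SemaphoreSlim idea avoids unstarted Task objects altogether. The sketch below is an assumption-laden alternative, not the helper from the blog post: ForEachThrottledAsync and RunOneAsync are hypothetical names, and Task.Delay stands in for an async FTP upload.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncThrottle
{
    // Runs work(item) for every item, with at most maxParallel operations in flight.
    public static async Task ForEachThrottledAsync<T>(
        IEnumerable<T> items, int maxParallel, Func<T, Task> work)
    {
        using (var throttler = new SemaphoreSlim(maxParallel))
        {
            var running = new List<Task>();
            foreach (var item in items)
            {
                // Wait for a free slot before starting the next operation.
                await throttler.WaitAsync();
                running.Add(RunOneAsync(item, work, throttler));
            }
            // All operations finish (and release) before the semaphore is disposed.
            await Task.WhenAll(running);
        }
    }

    static async Task RunOneAsync<T>(T item, Func<T, Task> work, SemaphoreSlim throttler)
    {
        try { await work(item); }
        finally { throttler.Release(); } // free the slot even if the work fails
    }

    public static void Main()
    {
        // Task.Delay is a placeholder for an async upload call.
        ForEachThrottledAsync(Enumerable.Range(0, 20), 4, i => Task.Delay(10))
            .GetAwaiter().GetResult();
        Console.WriteLine("done");
    }
}
```

Because the slot is released in a finally block inside the awaited operation, the limit applies to the whole asynchronous upload rather than only to its synchronous start.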

I have coded the technique below, which uses a BlockingCollection as a thread-count manager. It is quite simple to implement and does the job. It accepts Task objects and, for each one, adds an integer to a bounded blocking collection, which increases the running thread count by 1. When a task finishes, it takes an item back out of the collection, which unblocks the Add call of the next waiting task.

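        public class BlockingTaskQueue
        {
            // Each item in the bounded collection represents one running task;
            // Add blocks once maxThread items are present.
            private readonly BlockingCollection<int> threadManager;

            public bool IsWorking
            {
                get { return threadManager.Count > 0; }
            }

            public BlockingTaskQueue(int maxThread)
            {
                threadManager = new BlockingCollection<int>(maxThread);
            }

            // Accepts an unstarted task. The returned task completes once the given task has finished.
            public Task AddTask(Task task)
            {
                return Task.Run(() => Run(task));
            }

            private bool Run(Task task)
            {
                // Claim a slot first; this blocks while maxThread tasks are already running.
                threadManager.Add(1);
                try
                {
                    task.Start();
                    task.Wait();
                    return true;
                }
                catch (Exception)
                {
                    return false;
                }
                finally
                {
                    // Free the slot so that a waiting Add can proceed.
                    threadManager.Take();
                }
            }
        }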
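
A bounded BlockingCollection can also be used more conventionally, as a producer/consumer queue, which lets generation and upload have separate limits. The following is only a sketch of that related pattern under stated assumptions: the Pipeline class, the capacity of 16, and the file names are all hypothetical, and the "upload" merely counts items.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Pipeline
{
    // Generates fileCount names with 'generators' parallel producers and consumes
    // them with 'uploaders' consumers. Returns the number of items consumed.
    public static int Run(int fileCount, int generators, int uploaders)
    {
        // Bounded, so producers pause when the consumers fall behind.
        using (var ready = new BlockingCollection<string>(16))
        {
            int uploaded = 0;

            var uploadTasks = Enumerable.Range(0, uploaders)
                .Select(n => Task.Run(() =>
                {
                    foreach (var file in ready.GetConsumingEnumerable())
                    {
                        // Placeholder for the real FTP upload of 'file'.
                        Interlocked.Increment(ref uploaded);
                    }
                }))
                .ToArray();

            var options = new ParallelOptions { MaxDegreeOfParallelism = generators };
            // Placeholder for real file generation: just enqueue a name.
            Parallel.For(0, fileCount, options, i => ready.Add("file" + i + ".txt"));

            ready.CompleteAdding();    // lets the consumers' loops end
            Task.WaitAll(uploadTasks);
            return uploaded;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run(100, 4, 2) + " files uploaded");
    }
}
```

With this shape, the slow FTP connection can be limited to one or two uploaders while all 4 cores stay busy generating files.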
