
Scheduling using the Task Parallel Library

I have to process around 200,000 objects (in a desktop application), and each object takes around 20 ms to process. To speed this up, I want to process them concurrently.

For testing, I just put each object in a separate task, but because each job is so small, this yields only a tiny speed improvement. So my first question is:

Is there a clever (but not too complicated) way to find an optimal batch size for these objects? I guess I could do some local testing to see whether it is fastest to group them in batches of 10, 20 or 100 objects, but this seems a bit suboptimal.

Secondly (and more importantly): most of the objects should just be processed whenever they get some CPU time. However, the user will always be looking at 10-20 objects. I want to always be able to put the objects the user is looking at at the front of the queue in order to deliver a smooth user experience. The user might navigate around all the time, so I feel it is important to be able to reschedule the order quickly. (20 objects at 20 ms each should take around 0.4 seconds to process.)

Can someone help me with a good design pattern for processing these objects?

You could use Parallel.ForEach or Parallel.For if the objects are in a collection. Due to your user responsiveness requirements Parallel.For would be a better choice.

Unfortunately, there's no substitute for measuring performance and tweaking your strategy based on the results.
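As a rough illustration of such a measurement, here is a minimal sketch that times the same workload at a given batch size. The names BatchSizeProbe, ProcessItem and Measure are hypothetical, and the spin-wait is only a stand-in for the real 20 ms of work; Partitioner.Create and Stopwatch are the standard .NET types.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

static class BatchSizeProbe
{
    // Number of items processed so far (only used to sanity-check a run).
    public static int Processed;

    // Hypothetical stand-in for the real per-object work.
    public static void ProcessItem(int index)
    {
        Thread.SpinWait(10_000);
        Interlocked.Increment(ref Processed);
    }

    // Processes itemCount items in index ranges of batchSize and returns the elapsed time.
    public static TimeSpan Measure(int itemCount, int batchSize)
    {
        var stopwatch = Stopwatch.StartNew();

        // Partitioner.Create(from, to, rangeSize) yields [start, end) index ranges,
        // so each parallel iteration handles a whole batch rather than one item.
        Parallel.ForEach(Partitioner.Create(0, itemCount, batchSize), range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
                ProcessItem(i);
        });

        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Calling BatchSizeProbe.Measure(2000, 10), then with 20 and 100, on a representative sample of the data should show whether batch size matters at all for this workload. The first call also pays JIT and thread-pool warm-up costs, so it is probably worth discarding one run before comparing.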

If you want to process the items in parallel and you don't care about the order, just use Parallel.ForEach() (call it from a background thread so that you don't block the UI thread).
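A minimal sketch of that approach, assuming the items and the per-item action are supplied by the caller (BackgroundProcessing and ProcessAllAsync are hypothetical names):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class BackgroundProcessing
{
    // Starts the parallel loop on a thread-pool thread and returns immediately.
    // The returned task completes once every item has been processed, so the
    // UI thread is never blocked by Parallel.ForEach itself.
    public static Task ProcessAllAsync<T>(IEnumerable<T> items, Action<T> process)
    {
        return Task.Run(() =>
        {
            Parallel.ForEach(items, process);
        });
    }
}
```

In a desktop application the caller would typically await the returned task from an event handler and update the UI afterwards. The process callback itself runs on worker threads, so it should not touch UI controls directly.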

But if you want to implement that dynamic priority change, that's going to be more complicated.

One way would be to have an object, let's call it Job, that represents a single action that has to be executed. Then you would have a method that processes a queue of jobs, but executes the high-priority ones first if there are any. Something like:

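// Requires: using System.Collections.Generic; using System.Linq;

Queue<Job> jobs;                 // all jobs, in default order
IEnumerable<Job> priorityJobs;   // jobs the user is currently looking at;
                                 // replace this only while holding lock (jobs)

void ProcessJobs()
{
    while (true)
    {
        Job job = null;

        lock (jobs)
        {
            // Prefer a priority job that nobody has started yet.
            job = priorityJobs.FirstOrDefault(j => j.NotYetStarted);

            if (job == null)
            {
                // Otherwise take the next job from the queue, skipping jobs
                // that were already started as priority jobs.
                do
                {
                    if (jobs.Count == 0)
                        return;

                    job = jobs.Dequeue();
                } while (!job.NotYetStarted);
            }

            // Mark the job as taken so that no other thread picks it up.
            job.NotYetStarted = false;
        }

        // Execute outside the lock so that other threads can fetch jobs meanwhile.
        job.Execute();
    }
}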

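You would then start tasks that execute ProcessJobs() in parallel, for example (the ToArray() call matters, because Select() is lazy and would otherwise not start any task until the sequence is enumerated):

var tasks = Enumerable.Range(0, Environment.ProcessorCount)
    .Select(_ => Task.Run(() => ProcessJobs()))
    .ToArray();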
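To show how the priority list could be swapped while workers are running, here is a self-contained sketch of the same idea wrapped in a class. The Job shape, JobScheduler, SetPriorityJobs and RunAsync are all hypothetical names, and the sketch assumes that every priority job is also present in the main queue.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical Job type: one unit of work plus the flag used above.
class Job
{
    private readonly Action _work;
    public Job(Action work) { _work = work; }
    public bool NotYetStarted { get; set; } = true;
    public void Execute() => _work();
}

class JobScheduler
{
    private readonly object _gate = new object();
    private readonly Queue<Job> _jobs;
    private IReadOnlyList<Job> _priorityJobs = Array.Empty<Job>();

    public JobScheduler(IEnumerable<Job> jobs) { _jobs = new Queue<Job>(jobs); }

    // Call whenever the user navigates: these jobs are tried before the queue.
    public void SetPriorityJobs(IEnumerable<Job> visible)
    {
        var snapshot = visible.ToList();      // copy outside the lock
        lock (_gate) { _priorityJobs = snapshot; }
    }

    // Returns the next job to run, or null when nothing is left.
    private Job TakeNext()
    {
        lock (_gate)
        {
            Job job = _priorityJobs.FirstOrDefault(j => j.NotYetStarted);
            while (job == null && _jobs.Count > 0)
            {
                Job candidate = _jobs.Dequeue();
                if (candidate.NotYetStarted) job = candidate;  // skip already-taken jobs
            }
            if (job != null) job.NotYetStarted = false;
            return job;
        }
    }

    // Starts workerCount workers; the task completes when all jobs have run.
    public Task RunAsync(int workerCount)
    {
        var workers = Enumerable.Range(0, workerCount)
            .Select(_ => Task.Run(() =>
            {
                for (Job job = TakeNext(); job != null; job = TakeNext())
                    job.Execute();
            }))
            .ToArray();                        // force the lazy query so tasks start now
        return Task.WhenAll(workers);
    }
}
```

Because the priority list holds only the 10-20 visible objects, scanning it under the lock on every fetch should be cheap compared with 20 ms of work per job. A job that is already executing is not interrupted, so a newly prioritized job waits at most for one in-flight job per worker.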
