Using multiple threads in C#

I want to parse multiple files on a system as independent tasks and get the version of each, as follows:

public void obtainVersionList()
{
    for (int iterator = 1; iterator < list.Length; iterator++) // list stores all the file names
    {
        // GetVersion() overwrites each file name in list with that file's version
        Thread t = new Thread(() => GetVersion(ref list[iterator]));
        t.Start();
    }
}

Here,

  1. I get an index-out-of-bounds exception. How is that possible when the loop checks iterator < list.Length? Is it caused by the multiple threads?
  2. How can I minimize the running time when parsing multiple files on disk?

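For parallel execution I'd recommend Parallel.For or Parallel.ForEach (or the Task class). With ForEach, passing ref item would only change the lambda's own copy of the element, so index the array when the results must be written back into list:

Parallel.For(0, list.Length, i => GetVersion(ref list[i]));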

The TPL then does the thread management for you, typically on the thread pool, although you can plug in a different scheduler implementation. In general, reusing threads is cheaper than spawning many new ones.
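As a rough illustration of letting the TPL manage the threads, the sketch below caps the degree of parallelism and has each iteration write only to its own array slot. GetVersion here is a hypothetical stand-in that returns a string instead of using ref, and the names ParallelVersionSketch and GetVersions are made up for the example.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelVersionSketch
{
    // Hypothetical stand-in for the question's GetVersion: it fakes a version
    // string instead of reading a real file.
    public static string GetVersion(string fileName)
    {
        return fileName + " v1.0";
    }

    public static string[] GetVersions(string[] fileNames, int maxParallelism)
    {
        var versions = new string[fileNames.Length];
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelism };

        // Each iteration writes only to its own slot, so no locking is needed
        // and the results stay in input order.
        Parallel.For(0, fileNames.Length, options, i =>
        {
            versions[i] = GetVersion(fileNames[i]);
        });

        return versions;
    }
}
```

Calling ParallelVersionSketch.GetVersions(files, 4) would process the files on at most four worker threads at a time; the right limit depends on the machine and the workload.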

Inspired by weston's suggestions I tried out an alternative, which may be considered creative LINQ usage:

using System;
using System.Linq;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        var seq = Enumerable.Range(0, 10).ToList();
        var tasks = seq
            .Select(i => Task.Factory.StartNew(() => Foo(i)))
            .ToList(); // important: forces the query to run, which starts the tasks
        var result = tasks.Select(t => t.Result); // lazy: nothing blocks here

        // t.Result blocks only as the foreach loop reaches each task
        foreach (var r in result)
        {
            Console.WriteLine(r);
        }
    }

    static int Foo(int i)
    {
        return i;
    }
}

For each input in seq I create a Task<T> that does some work. The Result of each task is collected in result, which is not enumerated until the foreach. This code also preserves the order of your results.

The sample does not modify seq. This is a different concept from altering list in place, as you want to do.
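The role of ToList() can be checked with a small counter. This is only a sketch: LazyTaskSketch and its methods are hypothetical names, and squaring a number stands in for real work. It shows that a bare Select creates no tasks until something enumerates it, and that reading Result in list order returns the values in input order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class LazyTaskSketch
{
    // Returns { tasks created before ToList(), tasks created after ToList() }.
    public static int[] CountCreatedTasks()
    {
        int created = 0;
        IEnumerable<Task<int>> query = Enumerable.Range(0, 5).Select(i =>
        {
            created++; // runs on the calling thread while the query is enumerated
            return Task.Factory.StartNew(() => i * i);
        });

        int beforeToList = created;              // the query has not run yet
        List<Task<int>> tasks = query.ToList();  // enumerates the query, starting the tasks
        int afterToList = created;

        Task.WaitAll(tasks.ToArray());
        return new[] { beforeToList, afterToList };
    }

    // Results come back in input order, whatever order the tasks finish in.
    public static List<int> SquaresInOrder(int count)
    {
        var tasks = Enumerable.Range(0, count)
            .Select(i => Task.Factory.StartNew(() => i * i))
            .ToList();
        return tasks.Select(t => t.Result).ToList();
    }
}
```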

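The lambda captures the iterator variable itself, not its value at the time the lambda is created, so all threads share the same variable. A thread may not run until after the loop has incremented iterator again, possibly all the way to list.Length, which is what produces the out-of-bounds index. Copy it to a loop-local variable before using it in the lambda.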

Everyone falls for this at least once. The C# designers regretted this behavior enough that C# 5 changed it for foreach loop variables; a for loop variable, as used here, is still shared across iterations.
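The capture behavior can be reproduced without any threads. The sketch below (the class and method names are made up for illustration) stores lambdas created in a for loop and only invokes them after the loop has finished, which is roughly what happens when a thread starts late.

```csharp
using System;
using System.Collections.Generic;

public static class ClosureCaptureSketch
{
    // Every lambda captures the same loop variable i.
    public static List<int> SharedVariable(int count)
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            actions.Add(() => i);
        }

        // The loop is over, so i == count for every lambda.
        var seen = new List<int>();
        foreach (var action in actions)
        {
            seen.Add(action());
        }
        return seen;
    }

    // Every lambda captures its own per-iteration copy.
    public static List<int> LocalCopy(int count)
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int copy = i; // a new variable on each iteration
            actions.Add(() => copy);
        }

        var seen = new List<int>();
        foreach (var action in actions)
        {
            seen.Add(action());
        }
        return seen;
    }
}
```

SharedVariable(3) yields 3, 3, 3, and 3 is exactly the value that would be out of bounds as an index into a three-element array. LocalCopy(3) yields 0, 1, 2.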

To solve the index out of bounds problem you could make a local copy of the iteration variable:

for (int iterator = 1; iterator < list.Length; iterator++) // list stores all the file names
{
    int iterator1 = iterator; // fresh copy per iteration, so each lambda captures its own index
    Thread t = new Thread(() => GetVersion(ref list[iterator1]));
    t.Start();
    // GetVersion() overwrites list[iterator1] with that file's version
}

2) How to minimize the operation time when we parse multiple files in the disk?

Using multiple threads is not really a good idea when you have a single mechanical disk. You're only bouncing the mechanical head around as each thread gets a chance to run. Stick to a single thread for disk I/O.
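One way to follow this advice, sketched under assumptions: read the files one after another on a single thread, then parallelize only the CPU-bound parsing. ParseVersion is a hypothetical parser that treats the first line of a file as its version; the real parsing logic is not shown in the question.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class SequentialReadSketch
{
    // Hypothetical parser: assumes the first line of the content is the version.
    public static string ParseVersion(string content)
    {
        using (var reader = new StringReader(content))
        {
            return reader.ReadLine() ?? string.Empty;
        }
    }

    public static string[] GetVersions(string[] paths)
    {
        // Disk I/O: one thread, one file at a time.
        var contents = new string[paths.Length];
        for (int i = 0; i < paths.Length; i++)
        {
            contents[i] = File.ReadAllText(paths[i]);
        }

        // Parsing works on in-memory strings, so it can run in parallel.
        var versions = new string[paths.Length];
        Parallel.For(0, contents.Length, i =>
        {
            versions[i] = ParseVersion(contents[i]);
        });
        return versions;
    }
}
```

Holding every file in memory at once is only reasonable for small files; for large ones, reading just the part that contains the version would be a better fit.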

See this question

Do not close over your iterator variable. Instead, create a local variable and close over that:

public void obtainVersionList()
{
    // list stores all the file names
    for (int iterator = 1; iterator < list.Length; iterator++)
    {
        // Close over a per-iteration copy of the index, so GetVersion()
        // can still write the file version back into list
        var local = iterator;
        Thread t = new Thread(() => GetVersion(ref list[local]));
        t.Start();
    }
}

You shouldn't let multiple threads adjust the same list. This is not threadsafe unless the list is threadsafe. I don't know the type, but List<string> isn't.
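If several threads do need to add results to one shared collection, a collection from System.Collections.Concurrent is one option. A minimal sketch, assuming a hypothetical GetVersion that returns a string and assuming the file names are unique:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ThreadSafeCollectionSketch
{
    // Hypothetical stand-in for the real version lookup.
    public static string GetVersion(string file)
    {
        return file.ToUpperInvariant();
    }

    public static IDictionary<string, string> CollectVersions(IEnumerable<string> files)
    {
        // ConcurrentDictionary tolerates concurrent writers; List<string> does not.
        var versions = new ConcurrentDictionary<string, string>();
        Parallel.ForEach(files, file =>
        {
            versions[file] = GetVersion(file);
        });
        return versions;
    }
}
```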

The other thing is that you shouldn't create your own threads for this. If the list holds 200 files, your PC will grind to a halt creating 200 threads. Let the thread pool manage a sensible number of threads for you.
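To see the thread pool reusing threads, one could record which managed thread runs each work item. This is an illustrative sketch with made-up names; the number of distinct threads it reports varies by machine and load, so no particular value should be expected.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadPoolReuseSketch
{
    // Returns { completed work items, distinct threads that ran them }.
    public static int[] RunWorkItems(int workItems)
    {
        var threadIds = new ConcurrentDictionary<int, bool>();
        int completed = 0;

        Task[] tasks = Enumerable.Range(0, workItems)
            .Select(i => Task.Factory.StartNew(() =>
            {
                // Remember which pool thread picked up this work item.
                threadIds[Thread.CurrentThread.ManagedThreadId] = true;
                Interlocked.Increment(ref completed);
            }))
            .ToArray();

        Task.WaitAll(tasks);
        return new[] { completed, threadIds.Count };
    }
}
```

With RunWorkItems(200), the first number is 200, while the second is usually far smaller because the pool hands many work items to the same few threads.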

This solution assumes you have .NET 4.

Change the signature of GetVersion to: private static string GetVersion(string file)

var tasks = new List<Task<string>>();
// start the tasks
foreach (var file in list)
{
    var localFile = file; // local variable on advice of ReSharper
    tasks.Add(Task<string>.Factory.StartNew(() => GetVersion(localFile)));
}
// wait for them to complete
Task.WaitAll(tasks.ToArray());
// read the results
IEnumerable<string> result = tasks.Select(e => e.Result);
// print them out as a test
foreach (var str in result)
{
    Console.WriteLine(str);
}
