I want to run independent tasks that parse multiple files on a system and get the version of each, as follows:
public void obtainVersionList()
{
for(int iterator = 1; iterator < list.length; iterator++) //list stores all the file names
{
Thread t = new Thread( () => GetVersion(ref list[iterator]));
//list will again store the fileVersions using GetVersion()
}
}
Here,
For parallel execution I'd recommend Parallel.ForEach (or the Task class):
Parallel.ForEach(list, item => GetVersion(ref item));
The TPL you use then does the thread management for you, typically by using a thread pool. You can, however, use different scheduler implementations. In general, re-using threads is cheaper than spawning many.
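If you need the versions back in the same order as the file names, one possible variation is to loop over indices with Parallel.For and write each result into its own slot of a separate array. This is only a sketch: the GetVersion body here is a made-up stand-in for the real parser, and ParallelVersions and GetVersions are names I introduced for illustration.

```csharp
using System;
using System.Threading.Tasks;

static class ParallelVersions
{
    // Hypothetical stand-in for the real file parser.
    static string GetVersion(string fileName) => fileName + ":1.0";

    public static string[] GetVersions(string[] fileNames)
    {
        var versions = new string[fileNames.Length];

        // Each iteration writes only to its own slot, so no locking is needed
        // and the input array is left untouched.
        Parallel.For(0, fileNames.Length, i => versions[i] = GetVersion(fileNames[i]));

        return versions;
    }

    static void Main()
    {
        foreach (var version in GetVersions(new[] { "a.dll", "b.dll", "c.dll" }))
        {
            Console.WriteLine(version);
        }
    }
}
```

Parallel.For returns only after all iterations have finished, so the array is fully populated by the time it is returned.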
Inspired by weston's suggestions I tried out an alternative, which may be considered creative LINQ usage:
static void Main(string[] args)
{
var seq = Enumerable.Range(0, 10).ToList();
var tasks = seq
.Select(i => Task.Factory.StartNew(() => Foo(i)))
.ToList(); // important, spawns the tasks
var result = tasks.Select(t => t.Result);
// Select is lazy: t.Result is not read (and nothing blocks)
// until the foreach loop below iterates over result
foreach(var r in result)
{
Console.WriteLine(r);
}
}
static int Foo(int i)
{
return i;
}
For each input in seq I create a Task<T> doing something. The Result of these tasks is collected in result, which is not iterated before the foreach. This code also preserves the order of your results.
The sample does not modify seq. This is a different concept than altering list as you want to do.
The iterator variable is being captured by reference, not by value. That makes all threads share the same variable. Copy it to a loop-local variable first before using it in the lambda.
Everyone falls for this at least once. The C# designers have regretted this decision so much they consider changing it.
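The effect can be reproduced without any threads. In this small sketch (CaptureDemo and its method names are made up for illustration), the lambdas are created inside a for loop but only invoked after the loop has finished. By then the shared loop variable has already reached the loop bound, which is also why the threaded version can index past the end of list.

```csharp
using System;
using System.Collections.Generic;

static class CaptureDemo
{
    public static List<int> SharedVariable()
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            actions.Add(() => i); // every lambda captures the same variable i
        }

        var seen = new List<int>();
        foreach (var action in actions)
        {
            seen.Add(action()); // i is already 3 when the lambdas run
        }
        return seen;
    }

    public static List<int> LocalCopy()
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int copy = i; // a fresh variable per iteration
            actions.Add(() => copy);
        }

        var seen = new List<int>();
        foreach (var action in actions)
        {
            seen.Add(action());
        }
        return seen;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", SharedVariable())); // prints 3, 3, 3
        Console.WriteLine(string.Join(", ", LocalCopy()));      // prints 0, 1, 2
    }
}
```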
To solve the index out of bounds problem you could make a local copy of the iteration variable:
for(int iterator = 1; iterator < list.length; iterator++) //list stores all the file names
{
int iterator1 = iterator;
Thread t = new Thread( () => GetVersion(ref list[iterator1]));
//list will again store the fileVersions using GetVersion()
}
2) How to minimize the operation time when we parse multiple files on the disk?
That's not really a good idea when you have a single mechanical disk. You're only bouncing the mechanical head around as each thread gets a chance to run. Stick to a single thread for disk I/O.
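If the parsing itself is expensive, one way to follow that advice and still use several cores is to read the files one after another on a single thread and only parallelize the in-memory parsing afterwards. This is a rough sketch under assumptions: ReadThenParse and ParseVersion are hypothetical names, and treating the first line of a file as its version is only a placeholder for the real parsing logic. It also holds every file's content in memory at once, so it only suits small files.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

static class ReadThenParse
{
    // Hypothetical parser: treats the first line of the content as the version.
    static string ParseVersion(string content) => content.Split('\n')[0].Trim();

    public static string[] GetVersions(string[] paths)
    {
        // Step 1: disk I/O happens sequentially on the calling thread.
        string[] contents = paths.Select(path => File.ReadAllText(path)).ToArray();

        // Step 2: only the in-memory parsing runs in parallel.
        var versions = new string[contents.Length];
        Parallel.For(0, contents.Length, i => versions[i] = ParseVersion(contents[i]));

        return versions;
    }

    static void Main(string[] args)
    {
        // Pass file paths on the command line.
        foreach (var version in GetVersions(args))
        {
            Console.WriteLine(version);
        }
    }
}
```

If the parsing is trivial compared to the read, the second step gains nothing and a plain loop is simpler.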
See this question
Do not close over your iterator variable. Instead, create a local variable and close over that:
public void obtainVersionList()
{
//list stores all the file names
for(int iterator = 1; iterator < list.length; iterator++)
{
//list will again store the fileVersions using GetVersion()
// copy the element itself; note that ref local updates this copy, not list[iterator]
var local = list[iterator];
Thread t = new Thread( () => GetVersion(ref local));
}
}
You shouldn't let multiple threads adjust the same list. This is not thread-safe unless the list is thread-safe. I don't know the type, but List<string> isn't.
The other thing is that you shouldn't create your own threads for this. What if the list holds 200 files? Your PC will grind to a halt creating 200 threads. Let the thread pool do the work of managing a sensible number of threads for you.
This solution assumes you have .NET 4.
Change the signature of GetVersion to: private static string GetVersion(string file)
var tasks = new List<Task>();
//start tasks
foreach (var file in list)
{
var localFile = file; //local variable on advice of resharper
tasks.Add(Task<string>.Factory.StartNew(() => GetVersion(localFile)));
}
//wait for them to complete
Task.WaitAll(tasks.ToArray());
//read the results
IEnumerable<string> result = tasks.OfType<Task<string>>().Select(e => e.Result);
//print em out for test
foreach (var str in result)
{
Console.WriteLine(str);
}