I have a List that contains several Mesh objects. This is what such a Mesh class looks like:
public class Mesh
{
    public int GridWidth { get; }
    public int GridHeight { get; }
    public List<File> Files { get; }
    /* ... */
}
The List of files inside a Mesh object contains File objects. Each one mostly consists of a string with the file-system path to the file and a two-dimensional array that holds the content of the file after it has been parsed.
public class File
{
    public string Path { get; }
    public double[][] Matrix { get; set; }
    /* ... */
}
Multithreading and parsing work fine. I decided to launch as many threads as my CPU has cores, in my case 4.
With the help of LINQ I first collect all File objects in a single List:
List<File> allFiles = meshes.SelectMany(mesh => mesh.Files).ToList();
After that, each thread gets 1/4 of the objects from this list and starts parsing the files.
And this is my problem: files of the same size are located inside the same mesh (GridWidth * GridHeight = number of parsed matrix cells). So it could happen by chance that one thread gets only large files while another thread gets only small ones. In that case one thread would finish much earlier than the others, and I don't want that because it would be inefficient.
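To make the imbalance concrete, here is a minimal sketch that sums the number of matrix cells each thread would receive if the list were simply cut into contiguous blocks. The class name NaiveSplit, the method LoadPerThread, and the cell counts in the usage note are made up for illustration; they are not part of the code in this question.

```csharp
using System;
using System.Collections.Generic;

public static class NaiveSplit
{
    // Cuts the list into `threads` contiguous blocks (the "each thread gets
    // 1/n of the list" approach) and returns the summed cell count per block.
    // cellCounts[i] stands for GridWidth * GridHeight of the i-th file.
    public static long[] LoadPerThread(IReadOnlyList<int> cellCounts, int threads)
    {
        if (threads <= 0) throw new ArgumentOutOfRangeException(nameof(threads));

        var loads = new long[threads];
        // Ceiling division, so every file lands in one of the blocks.
        int blockSize = (cellCounts.Count + threads - 1) / threads;

        for (int i = 0; i < cellCounts.Count; i++)
        {
            loads[i / blockSize] += cellCounts[i];
        }
        return loads;
    }
}
```

With two hypothetical meshes of four files each, one with 1000 cells per file and one with 10 cells per file, NaiveSplit.LoadPerThread(new[] { 1000, 1000, 1000, 1000, 10, 10, 10, 10 }, 2) gives one thread 4000 cells and the other only 40.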
So I had the idea to first sort the list of meshes by size and then distribute their files, in the manner of the Shear Sort method (or Snake Sort), to a new List for each thread. The following algorithm works, but I think there could be some room for improvement.
And these are my questions: Is this algorithm already efficient enough, or is there a better way of providing lists of files to each thread? If there isn't a better way, I would be interested in a "smarter" way of coding it (the for loop seems a bit complex with all its if/else and modulo operations).
int cores = 4;
List<File>[] filesOfThreads = new List<File>[cores];
List<File> allFilesDesc = meshes.OrderByDescending(mesh => mesh.GridWidth * mesh.GridHeight).SelectMany(mesh => mesh.Files).ToList();
int threadIndex = 0;
/*
* Inside this for-loop the threadIndex changes
* with each cycle in this way (in case of 4 cores):
* 0->1->2->3->3->2->1->0->0->1->2->3->3->2 ...
* With each cycle a file of the current position of
* allFilesDesc[i] is added to the list of
* filesOfThreads[threadIndex]. In this "shear" sort
* way every thread should get approximately the same
* number of big and small files.
*/
for (int i = 0; i < allFilesDesc.Count; i++)
{
    if (i < cores)
    {
        filesOfThreads[threadIndex] = new List<File>();
    }
    filesOfThreads[threadIndex].Add(allFilesDesc[i]);
    if (i < cores - 1)
    {
        threadIndex++;
    }
    else if ((i + 1) % cores != 0)
    {
        threadIndex += ((i + 1) / cores) % 2 == 0 ? 1 : -1;
    }
}
foreach (var files in filesOfThreads)
{
    Thread thread = new Thread(() => ComputeFiles(files));
    thread.Start();
}
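On the "smarter way of coding" part: the 0->1->2->3->3->2->1->0 pattern can also be computed directly from the loop index, without carrying threadIndex from one iteration to the next. The following is only a sketch of that idea, not a claim about the best approach. The names SnakeDistribution, BucketIndex and Distribute are invented here, and the input is assumed to be sorted by descending size already.

```csharp
using System;
using System.Collections.Generic;

public static class SnakeDistribution
{
    // Maps position i of the sorted list to a bucket, following the pattern
    // 0,1,...,n-1,n-1,...,1,0,0,1,... for n buckets.
    public static int BucketIndex(int i, int buckets)
    {
        int pos = i % (2 * buckets);   // position within one forward + backward pass
        return pos < buckets ? pos : 2 * buckets - 1 - pos;
    }

    // Distributes the items over the given number of buckets in snake order.
    public static List<T>[] Distribute<T>(IReadOnlyList<T> items, int buckets)
    {
        if (buckets <= 0) throw new ArgumentOutOfRangeException(nameof(buckets));

        var result = new List<T>[buckets];
        for (int b = 0; b < buckets; b++)
        {
            // Created up front, so no bucket is null when there are
            // fewer items than buckets.
            result[b] = new List<T>();
        }

        for (int i = 0; i < items.Count; i++)
        {
            result[BucketIndex(i, buckets)].Add(items[i]);
        }
        return result;
    }
}
```

With this helper, the loop could be replaced by something like List<File>[] filesOfThreads = SnakeDistribution.Distribute(allFilesDesc, cores);. For eight items with sizes 8, 7, 6, 5, 4, 3, 2, 1 and four buckets, every bucket ends up with a total of 9.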
My suggestion:
/// <summary>
/// Helper methods for the lists.
/// </summary>
public static class ListExtensions
{
    public static List<List<T>> ChunkBy<T>(this List<T> source, int chunkSize)
    {
        return source
            .Select((x, i) => new { Index = i, Value = x })
            .GroupBy(x => x.Index / chunkSize)
            .Select(x => x.Select(v => v.Value).ToList())
            .ToList();
    }
}
For example, if you chunk a list of 18 items by 5 items per chunk, it gives you a list of 4 sublists containing 5, 5, 5 and 3 items.
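Note that ChunkBy takes the chunk size, not the number of chunks, so to get one sublist per thread the size has to be derived from the item count first. A minimal sketch of that step follows. ChunkDemo, ChunkSizeFor and SplitForThreads are hypothetical names added for illustration, and ChunkBy is repeated only so that the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListExtensions
{
    // Same ChunkBy as shown above.
    public static List<List<T>> ChunkBy<T>(this List<T> source, int chunkSize)
    {
        return source
            .Select((x, i) => new { Index = i, Value = x })
            .GroupBy(x => x.Index / chunkSize)
            .Select(x => x.Select(v => v.Value).ToList())
            .ToList();
    }
}

public static class ChunkDemo
{
    // Chunk size such that `count` items yield at most `chunks` sublists
    // (ceiling division, never smaller than 1).
    public static int ChunkSizeFor(int count, int chunks)
    {
        if (chunks <= 0) throw new ArgumentOutOfRangeException(nameof(chunks));
        return Math.Max(1, (count + chunks - 1) / chunks);
    }

    // Splits the items into at most `threads` contiguous sublists.
    public static List<List<T>> SplitForThreads<T>(List<T> items, int threads)
    {
        return items.ChunkBy(ChunkSizeFor(items.Count, threads));
    }
}
```

For 18 items and 4 threads, ChunkSizeFor returns 5, so SplitForThreads produces the same 5, 5, 5 and 3 split. The chunks are contiguous runs of the source list, so if the list is sorted by file size, the large files still end up together in the first chunks.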