Parallel.ForEach with shared absolute index for all the parallel processes

I'm running a process that works on, for example, 10 files at a time. I need to assign serial numbers based on the order of the input file array. So, for each parallel iteration, the serial numbers it uses must follow the same order as the input string array myFiles. Do I need some kind of thread-safe or concurrent int? What's the correct approach?

var results = new ConcurrentQueue<string>();
var options = new ParallelOptions
    { MaxDegreeOfParallelism = Environment.ProcessorCount * 10 };
int startSerialNumber = 1;
if (runParallel)
{
    Parallel.ForEach(myFiles, options, (myFile) =>
    {
        var newMyFile = WorkOnMyFile(myFile,startSerialNumber);
        startSerialNumber += SubFileCount; // <-- This needs to be shared by all
            // parallel iterations; how do I control the incrementing safely?
        results.Enqueue(RunExeTask(newMyFile, outputDirectory,false));
    });
}

Generate the serial numbers outside of the parallel processing. Incrementing a number is trivial, so there is no need to do it on multiple threads. As you generate them, pair them with the items in your list to create a new list containing both, then iterate over that.

var myData = myFiles
    .Select
    (
        (f, i) => new { File = f, SerialNumber = startSerialNumber + (i * SubFileCount) }
    )
    .ToList();
Parallel.ForEach(myData, options, (myItem) =>
{
    var myFile = myItem.File;
    var serialNumber = myItem.SerialNumber;
    var newMyFile = WorkOnMyFile(myFile,serialNumber);
    results.Enqueue(RunExeTask(newMyFile, outputDirectory,false));
});
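
As an alternative sketch (not part of the answer above), Parallel.ForEach also has an overload whose body receives the element's zero-based index as a long, so the serial number can be derived inside the loop without building the intermediate list. The names SerialDemo and ComputeSerials and the sample values are hypothetical, chosen only for illustration:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class SerialDemo
{
    // Hypothetical helper: maps each file to its first serial number,
    // derived only from the file's position in the input array.
    public static ConcurrentDictionary<string, int> ComputeSerials(
        string[] files, int startSerialNumber, int subFileCount)
    {
        var serials = new ConcurrentDictionary<string, int>();

        // The (item, state, index) overload passes the element's index in the source.
        Parallel.ForEach(files, (file, state, index) =>
        {
            serials[file] = startSerialNumber + (int)index * subFileCount;
        });

        return serials;
    }

    public static void Main()
    {
        var files = new[] { "a.txt", "b.txt", "c.txt" };
        var serials = ComputeSerials(files, 1, 10);

        // Print in input order; the loop itself may have run in any order.
        foreach (var file in files)
            Console.WriteLine($"{file} -> {serials[file]}");
    }
}
```

Because each serial number depends only on the index, no shared counter is mutated inside the loop, and the order in which iterations run does not matter.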

I suggest using PLINQ instead of the Parallel class, because PLINQ can collect the processed results in a thread-safe manner and (optionally) return them in their original order. It also makes it easy to get the index of the item being processed, through the Select overload that accepts an index:

public static ParallelQuery<TResult> Select<TSource, TResult> (
    this ParallelQuery<TSource> source,
    Func<TSource, int, TResult> selector);

Usage example:

string[] results = myFiles
    .AsParallel()
    .AsOrdered() // Optional, by default the original order will not be preserved
    .WithDegreeOfParallelism(runParallel ? Environment.ProcessorCount : 1)
    .Select((myFile, index) =>
    {
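        // Derive the serial number from the item's position in myFiles.
        var newMyFile = WorkOnMyFile(myFile, startSerialNumber + index * SubFileCount);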
        return RunExeTask(newMyFile, outputDirectory, false);
    }).ToArray();
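
A minimal, self-contained sketch of the same pattern, assuming each file consumes a fixed block of subFileCount serial numbers starting at startSerialNumber. PlinqSerialDemo, Process, and the sample file names are hypothetical stand-ins for the real work:

```csharp
using System;
using System.Linq;

public static class PlinqSerialDemo
{
    // Hypothetical helper: pairs each file with the first serial number of its block.
    public static string[] Process(string[] files, int startSerialNumber, int subFileCount)
    {
        return files
            .AsParallel()
            .AsOrdered() // keep the results in input order
            .Select((file, index) =>
                $"{file}:{startSerialNumber + index * subFileCount}")
            .ToArray();
    }

    public static void Main()
    {
        var results = Process(new[] { "a.txt", "b.txt", "c.txt" }, 1, 10);
        Console.WriteLine(string.Join(", ", results));
        // prints "a.txt:1, b.txt:11, c.txt:21"
    }
}
```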

You could try something like this:

    var results = new ConcurrentQueue<string>();
    var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount * 10 };
    int startSerialNumber = 1;
    if (runParallel)
    {
        new Thread(() =>
        {
            Task.Run(() =>
            {
                Parallel.ForEach(myFiles, options, (myFile) =>
                {
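                    // Reserve this file's block atomically; Interlocked.Add returns the updated total.
                    int serialNumber = Interlocked.Add(ref startSerialNumber, SubFileCount) - SubFileCount;
                    var newMyFile = WorkOnMyFile(myFile, serialNumber);
                    results.Enqueue(RunExeTask(newMyFile, outputDirectory, false));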
                });
            }).Wait();
        }).Start();
    }

Essentially, this runs your operation on a background thread, which waits for the task to complete before carrying on. As long as you don't reference startSerialNumber anywhere else, you should be fine.
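
One caveat about a shared counter, sketched here under stated assumptions: the return value of Interlocked.Add is the counter after the addition, so subtracting the block size gives a block start that no other thread can receive. The names InterlockedSerialDemo and ReserveBlocks are hypothetical, and the sketch only reserves blocks; it does no file work:

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class InterlockedSerialDemo
{
    // Hypothetical helper: reserves one block of serial numbers per file
    // from a shared counter and returns the block starts, sorted.
    public static int[] ReserveBlocks(int fileCount, int startSerialNumber, int subFileCount)
    {
        int next = startSerialNumber;
        var starts = new ConcurrentBag<int>();

        Parallel.For(0, fileCount, i =>
        {
            // Interlocked.Add returns the value after the addition, so
            // subtracting the block size yields the start of this block.
            int blockStart = Interlocked.Add(ref next, subFileCount) - subFileCount;
            starts.Add(blockStart);
        });

        return starts.OrderBy(s => s).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ReserveBlocks(4, 1, 10)));
        // prints "1, 11, 21, 31"
    }
}
```

The blocks are unique and non-overlapping, but which file receives which block depends on thread scheduling. A shared counter on its own therefore does not guarantee that serial numbers follow the order of the input array; deriving the number from the item's index does.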
