Run tasks in parallel and concatenate the output

I'd like to fetch from multiple data providers. They all return the same data structure, but with different data. At the end, the output of the data sources needs to be concatenated so I can use the combined result. To improve performance, these data sources need to be called in parallel. This is my current solution:

Task<List<Result>> dataSource1 = null;
Task<List<Result>> dataSource2 = null;
foreach (var dataSource in dataSourcesToBeFetched)
{
    switch (dataSource)
    {
        case DataSource.DataSource1:
            dataSource1 = DataSource1();
            break;

        case DataSource.DataSource2:
            dataSource2 = DataSource2();
            break;
    }
}
await Task.WhenAll(dataSource1, dataSource2);
var allData = dataSource1.Result.Concat(dataSource2.Result);

But I am not happy with it. Every time I add a data source, I need to append its result to the list by hand, which looks ugly. Besides that, I'd like to use switch expressions, but I am struggling with this.

A problem in your code is that if DataSource.DataSource1 is not present in dataSourcesToBeFetched, you are awaiting a null task.
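To make the null-task problem concrete, here is a minimal sketch. The class and method names are made up for illustration, and List<int> stands in for the question's List<Result>. As far as I know, Task.WhenAll validates its arguments up front and throws an ArgumentException when one of the tasks is null:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class NullTaskDemo
{
    // Returns true when Task.WhenAll rejects an argument list containing a null task.
    public static bool WhenAllRejectsNull()
    {
        Task<List<int>> present = Task.FromResult(new List<int> { 1, 2 });
        // Never assigned, like dataSource2 when that source was not requested.
        Task<List<int>> missing = null;

        try
        {
            // The argument check happens here, before anything is awaited.
            Task.WhenAll(present, missing);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

So the original code only works when every data source is requested.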

I would probably go for a collection of tasks to await.

Something like:

var dataSources = new List<Task<List<Result>>>();

// check if the DataSource1 is present in the dataSourcesToBeFetched
if(dataSourcesToBeFetched.Any(i => i == DataSource.DataSource1))
    dataSources.Add(DataSource1());

// check if the DataSource2 is present in the dataSourcesToBeFetched
if(dataSourcesToBeFetched.Any(i => i == DataSource.DataSource2))
    dataSources.Add(DataSource2());

// a list to hold all results
var allData = new List<Result>();

// if we need to fetch any, await all tasks.
if(dataSources.Count > 0)
{
    await Task.WhenAll(dataSources);

    // add the results to the list.
    foreach(var dataSource in dataSources)
        allData.AddRange(dataSource.Result);
}

All this code can be replaced with:

var results=await Task.WhenAll(DataSource1(),DataSource2());

The Task.WhenAll<TResult>(Task<TResult>[]) method returns a Task<TResult[]> with the results of all async operations.

Once you have the results, you can merge them with Enumerable.SelectMany:

var flattened=results.SelectMany(r=>r).ToList();
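For reference, a self-contained sketch of those two steps. The Result record and both data source bodies are hypothetical stand-ins (the question doesn't show them), and FlattenDemo/FetchAllAsync are names I made up:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for the question's Result type.
public record Result(string Source, int Value);

public static class FlattenDemo
{
    // Hypothetical data sources; real ones would do I/O instead of Task.Delay.
    public static async Task<List<Result>> DataSource1()
    {
        await Task.Delay(50);
        return new List<Result> { new Result("one", 1), new Result("one", 2) };
    }

    public static async Task<List<Result>> DataSource2()
    {
        await Task.Delay(10);
        return new List<Result> { new Result("two", 3) };
    }

    public static async Task<List<Result>> FetchAllAsync()
    {
        // Both methods are called before the await, so the two fetches overlap.
        List<Result>[] results = await Task.WhenAll(DataSource1(), DataSource2());

        // The array follows the argument order, not the completion order.
        return results.SelectMany(r => r).ToList();
    }
}
```

Even though the second source finishes first here, its items come last in the flattened list because WhenAll keeps the order in which the tasks were passed.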

While you can combine both operations, it's best to avoid it. This results in code that's hard to read, maintain and debug. During debugging, you'll often want to break after the await to check the results for e.g. nulls or other unexpected values.

The tasks and flattening run on different threads, which makes debugging with the chained calls harder.

If you really need to, you can use ContinueWith after WhenAll to process the results in a threadpool thread before returning them:

var flatten=await Task.WhenAll(DataSource1(),DataSource2())
                      .ContinueWith(t=>t.Result.SelectMany(r=>r)
                                        .ToList());

Update

To filter the sources, a quick & dirty way would be to create a Dictionary that maps source IDs to methods and use LINQ's Select to pick them:

//In a field
Dictionary<DataSource,Func<Task<List<Result>>>> map=new (){
    [DataSource.Source1]=DataSource1,
    [DataSource.Source2]=DataSource2
};

//In the method
DataSource[] fetchSources=new DataSource[0];
var tasks=fetchSources.Select(s=>map[s]());

But that's little different from using a function to do the same job:

DataSource[] fetchSources=new DataSource[0];
var tasks=fetchSources.Select(s=>RunSource(s));
//or even 
//var tasks=fetchSources.Select(RunSource);
    
var results=await Task.WhenAll(tasks);
var flattened=results.SelectMany(r=>r).ToList();


public static Task<List<Result>> RunSource(DataSource source)
{
    return source switch {
            DataSource.Source1=> DataSource1(),
            DataSource.Source2=> DataSource2(),
            _=>throw new ArgumentOutOfRangeException(nameof(source))
    };
}
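Putting the pieces together, a runnable sketch could look like this. The enum members, the Result record, the stub data sources and the Fetcher/FetchAsync names are assumptions for illustration only:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical enum and result type; the question doesn't show the real ones.
public enum DataSource { Source1, Source2 }

public record Result(string Source, int Value);

public static class Fetcher
{
    // Stub data sources; real ones would call a provider.
    static async Task<List<Result>> DataSource1()
    {
        await Task.Delay(50);
        return new List<Result> { new Result("one", 1), new Result("one", 2) };
    }

    static async Task<List<Result>> DataSource2()
    {
        await Task.Delay(10);
        return new List<Result> { new Result("two", 3) };
    }

    public static Task<List<Result>> RunSource(DataSource source) => source switch
    {
        DataSource.Source1 => DataSource1(),
        DataSource.Source2 => DataSource2(),
        _ => throw new ArgumentOutOfRangeException(nameof(source))
    };

    public static async Task<List<Result>> FetchAsync(IEnumerable<DataSource> sources)
    {
        // ToArray runs the Select immediately, so each source is started once, before the await.
        Task<List<Result>>[] tasks = sources.Select(RunSource).ToArray();

        // With no tasks at all, WhenAll completes right away with an empty array.
        List<Result>[] results = await Task.WhenAll(tasks);
        return results.SelectMany(r => r).ToList();
    }
}
```

Calling Fetcher.FetchAsync(new[] { DataSource.Source2, DataSource.Source1 }) returns the items of Source2 followed by those of Source1. An empty source list gives an empty result list, so no separate Count > 0 check is needed.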
