
How to wait until item goes through pipeline?

So, I'm trying to wrap my head around Microsoft's Dataflow library. I've built a very simple pipeline consisting of just two blocks:

var start = new TransformBlock<Foo, Bar>(foo => new Bar());   // placeholder transform
var end = new ActionBlock<Bar>(bar => { /* consume bar */ });  // placeholder action
start.LinkTo(end);

Now I can asynchronously process Foo instances by calling:

start.SendAsync(new Foo());

What I do not understand is how to do the processing synchronously, when needed. I thought that waiting on SendAsync would be enough:

start.SendAsync(new Foo()).Wait();

But apparently it returns as soon as the item is accepted by the first processor in the pipeline, not when the item is fully processed. So is there a way to wait until a given item has been processed by the last (end) block, apart from passing a WaitHandle through the entire pipeline?
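
For illustration, a minimal sketch of the behavior described above (the delay and block bodies are placeholders, not from the original question): awaiting SendAsync only waits for the item to be accepted into the first block's input buffer, not for downstream blocks to finish with it.

var start = new TransformBlock<Foo, Bar>(async foo =>
{
    await Task.Delay(TimeSpan.FromSeconds(5)); // simulate slow processing
    return new Bar();
});
var end = new ActionBlock<Bar>(bar => Console.WriteLine("fully processed"));
start.LinkTo(end);

await start.SendAsync(new Foo()); // completes almost immediately: the item
                                  // was merely accepted, not yet processed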

In short, that's not supported out of the box in Dataflow. Essentially what you need to do is to tag the data so you can retrieve it when processing is done. I've written up a way to do this that lets the consumer await a Job as it gets processed by the pipeline. The only concession to pipeline design is that each block takes a KeyValuePair<Guid, T>. This is the basic JobManager and the post I wrote about it. Note the code in the post is a bit dated and needs some updates, but it should get you in the right direction.

namespace ConcurrentFlows.DataflowJobs {
    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Threading.Tasks;
    using System.Threading.Tasks.Dataflow;

    /// <summary>
    /// A generic interface defining that:
    /// for a specified input type => an awaitable result is produced.
    /// </summary>
    /// <typeparam name="TInput">The type of data to process.</typeparam>
    /// <typeparam name="TOutput">The type of data the consumer expects back.</typeparam>
    public interface IJobManager<TInput, TOutput> {
        Task<TOutput> SubmitRequest(TInput data);
    }

    /// <summary>
    /// A TPL-Dataflow based job manager.
    /// </summary>
    /// <typeparam name="TInput">The type of data to process.</typeparam>
    /// <typeparam name="TOutput">The type of data the consumer expects back.</typeparam>
    public class DataflowJobManager<TInput, TOutput> : IJobManager<TInput, TOutput> {

        /// <summary>
        /// It is anticipated that jobHandler is an injected
        /// singleton instance of a Dataflow based 'calculator', though this implementation
        /// does not depend on it being a singleton.
        /// </summary>
        /// <param name="jobHandler">A singleton Dataflow block through which all jobs are processed.</param>
        public DataflowJobManager(IPropagatorBlock<KeyValuePair<Guid, TInput>, KeyValuePair<Guid, TOutput>> jobHandler) {
            if (jobHandler == null) { throw new ArgumentException("Argument cannot be null.", "jobHandler"); }

            this.JobHandler = jobHandler;
            if (!alreadyLinked) {
                JobHandler.LinkTo(ResultHandler, new DataflowLinkOptions() { PropagateCompletion = true });
                alreadyLinked = true;
            }
        }

        private static bool alreadyLinked = false;            

        /// <summary>
        /// Submits the request to the JobHandler and asynchronously awaits the result.
        /// </summary>
        /// <param name="data">The input data to be processd.</param>
        /// <returns></returns>
        public async Task<TOutput> SubmitRequest(TInput data) {
            var taggedData = TagInputData(data);
            var job = CreateJob(taggedData);
            Jobs.TryAdd(job.Key, job.Value);
            await JobHandler.SendAsync(taggedData);
            return await job.Value.Task;
        }

        private static ConcurrentDictionary<Guid, TaskCompletionSource<TOutput>> Jobs {
            get;
        } = new ConcurrentDictionary<Guid, TaskCompletionSource<TOutput>>();

        private static ExecutionDataflowBlockOptions Options {
            get;
        } = GetResultHandlerOptions();

        private static ITargetBlock<KeyValuePair<Guid, TOutput>> ResultHandler {
            get;
        } = CreateReplyHandler(Options);

        private IPropagatorBlock<KeyValuePair<Guid, TInput>, KeyValuePair<Guid, TOutput>> JobHandler {
            get;
        }

        private KeyValuePair<Guid, TInput> TagInputData(TInput data) {
            var id = Guid.NewGuid();
            return new KeyValuePair<Guid, TInput>(id, data);
        }

        private KeyValuePair<Guid, TaskCompletionSource<TOutput>> CreateJob(KeyValuePair<Guid, TInput> taggedData) {
            var id = taggedData.Key;
            var jobCompletionSource = new TaskCompletionSource<TOutput>();
            return new KeyValuePair<Guid, TaskCompletionSource<TOutput>>(id, jobCompletionSource);
        }

        private static ExecutionDataflowBlockOptions GetResultHandlerOptions() {
            return new ExecutionDataflowBlockOptions() {
                MaxDegreeOfParallelism = Environment.ProcessorCount,
                BoundedCapacity = 1000
            };
        }

        private static ITargetBlock<KeyValuePair<Guid, TOutput>> CreateReplyHandler(ExecutionDataflowBlockOptions options) {
            return new ActionBlock<KeyValuePair<Guid, TOutput>>((result) => {
                ReceiveOutput(result);
            }, options);
        }

        private static void ReceiveOutput(KeyValuePair<Guid, TOutput> result) {
            var jobId = result.Key;
            TaskCompletionSource<TOutput> jobCompletionSource;
            if (!Jobs.TryRemove(jobId, out jobCompletionSource)) {
                throw new InvalidOperationException($"The jobId: {jobId} was not found.");
            }
            var resultValue = result.Value;
            jobCompletionSource.SetResult(resultValue);            
        }
    }
}
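
A hypothetical usage sketch (Foo, Bar and the transform body are placeholders; only DataflowJobManager and SubmitRequest come from the code above): the injected jobHandler must carry the Guid tag through unchanged so the result handler can match each output to the awaiting TaskCompletionSource.

// The single Dataflow "calculator": transforms the payload but preserves the tag.
var jobHandler = new TransformBlock<KeyValuePair<Guid, Foo>, KeyValuePair<Guid, Bar>>(
    tagged => new KeyValuePair<Guid, Bar>(tagged.Key, new Bar() /* real work here */));

var manager = new DataflowJobManager<Foo, Bar>(jobHandler);

// Each caller awaits only its own item; the task completes when the
// result handler block receives the matching tagged output.
Bar result = await manager.SubmitRequest(new Foo());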

I ended up using the following pipeline:

var start = new TransformBlock<FooBar, FooBar>(...);
var end = new ActionBlock<FooBar>(item => item.Complete());
start.LinkTo(end);
var input = new FooBar {Input = new Foo()};
start.SendAsync(input);
input.Task.Wait();

Where

class FooBar
{
    public Foo Input { get; set; }
    public Bar Result { get; set; }
    public Task<Bar> Task { get { return _taskSource.Task; } }

    public void Complete()
    {
        _taskSource.SetResult(Result);
    }

    private TaskCompletionSource<Bar> _taskSource = new TaskCompletionSource<Bar>();
}

Less than ideal, but it works.
