
How do I implement this parallel pattern in C#?

What I'm ultimately trying to accomplish is to get the HTML from an unknown but limited number of webpages, where GetPage(i) returns the HTML for page i, and I want to stop as soon as I find a non-page.

The exact pattern I'm going for is this:

  • Start N parallel tasks: GetPage(0), ..., GetPage(N-1).
  • As soon as a task GetPage(i) completes: if it was able to get the page, add the page to a collection of pages and try to get the next page number that hasn't been attempted yet; if it was not able to get the page, cancel all tasks GetPage(j) where j > i.

My attempted implementation so far looks like this:

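        var docs = new LinkedList<HtmlDocument>();
        int tlimit = 20;
        var tasks = new Task<HtmlDocument>[tlimit];
        for (int i = 0; i < tlimit; ++i)
        {
            // Copy the loop variable: the lambda would otherwise capture i itself
            // and could see a later value (even tlimit) by the time it runs.
            int page = i;
            tasks[i] = Task.Run(() => BoardScanner.GetBoardPage(page));
        }
        // ???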

This might be what you're looking for. It processes your documents in parallel and asynchronously, possibly starting with the largest one (though that is not guaranteed). If you really need to go strictly by size, you'll have to tweak this quite a bit, and I'm not sure you would still end up with parallel processing, since running in parallel means giving the machine more control over what gets handled first:

var docs = new List<XmlDocument>();

var tasks = docs
    .OrderByDescending(p => p.InnerXml.Length) // largest first, as described above
    .Select(file => Task.Run(async () =>
    {
        await BoardScanner.GetBoardPage(file);
        // your document treatment logic here
    }));

await Task.WhenAll(tasks);
// your logic once all of your documents have been processed
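
For the pattern exactly as the question states it (N requests in flight, each finished request claims the next untried page number, and a missing page i cancels every request j > i), something along these lines might work. It is only a sketch under assumptions: `PageCrawler`, `FetchUntilMissingAsync` and the `getPage` delegate are made-up names, and `getPage` is assumed to be an async, cancellable stand-in for GetPage / BoardScanner.GetBoardPage that returns null for a non-page. Each request gets its own CancellationTokenSource so that requests below the missing page are left alone.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class PageCrawler
{
    // getPage is a hypothetical stand-in for BoardScanner.GetBoardPage: it is assumed
    // to return the HTML of page i, or null when page i does not exist.
    public static async Task<List<string>> FetchUntilMissingAsync(
        Func<int, CancellationToken, Task<string?>> getPage, int workerCount)
    {
        var gate = new object();
        var pages = new SortedDictionary<int, string>();
        var requests = new Dictionary<int, CancellationTokenSource>();
        int next = 0;                     // next page index nobody has tried yet
        int firstMissing = int.MaxValue;  // smallest index known not to exist

        async Task WorkerAsync()
        {
            while (true)
            {
                int i;
                CancellationTokenSource cts;
                lock (gate)
                {
                    if (next >= firstMissing) return;   // nothing useful left to claim
                    i = next++;
                    cts = requests[i] = new CancellationTokenSource();
                }

                string? html;
                try
                {
                    html = await getPage(i, cts.Token).ConfigureAwait(false);
                }
                catch (OperationCanceledException) when (cts.IsCancellationRequested)
                {
                    return;   // a lower-numbered page turned out to be missing
                }

                if (html != null)
                {
                    lock (gate) pages[i] = html;
                    continue;   // go claim the next untried index
                }

                // Page i is missing: record the bound, then cancel every request above it.
                List<CancellationTokenSource> toCancel;
                lock (gate)
                {
                    firstMissing = Math.Min(firstMissing, i);
                    toCancel = requests.Where(r => r.Key > i).Select(r => r.Value).ToList();
                }
                foreach (var c in toCancel) c.Cancel();
                return;
            }
        }

        // Start N workers; they claim indices 0..N-1 first, then whatever comes next.
        var workers = Enumerable.Range(0, workerCount).Select(_ => WorkerAsync()).ToArray();
        await Task.WhenAll(workers).ConfigureAwait(false);

        foreach (var c in requests.Values) c.Dispose();
        // Drop pages that were fetched above the first gap before the gap was discovered.
        return pages.Where(p => p.Key < firstMissing).Select(p => p.Value).ToList();
    }
}
```

You would call it as `await PageCrawler.FetchUntilMissingAsync(getPage, 20)`. Note that it only terminates once some page comes back as missing, which matches the "unknown but limited" assumption in the question. If your real GetBoardPage is synchronous, you could wrap it in Task.Run inside the delegate, but then cancellation can only prevent a request from starting, not interrupt one already running.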
