
How to keep track of the running location for a long-running parallel program

I have a restartable program that runs over a very large space, and I have started parallelizing parts of it. Each task runs independently and updates a database with its results. It doesn't matter if tasks are repeated (they are fully deterministic given the input array and will simply generate the same result they did before), but repeating them is relatively inefficient. So far I have come up with the following pattern:

    static void Main(string[] args) {
        GeneratorStart = Storage.Load();
        var tasks = new List<Task>();
        foreach (int[] temp in Generator()) {
            var arr = temp;
            var task = new Task(() => {
                //... use arr as needed
            });
            task.Start();
            tasks.Add(task);
            if (tasks.Count > 4) {
                Task.WaitAll(tasks.ToArray());
                Storage.UpdateStart(temp);
                tasks = new List<Task>();
            }
        }
    }

Prior to making the generator restartable, I had a simple Parallel.ForEach loop over it, and it was a bit faster. I think I am losing some CPU time with the WaitAll operation. How can I get rid of this bottleneck while still keeping track of which tasks I don't have to run again when I restart?

Other bits for those concerned (shortened for brevity):

    class Program {
        static bool Done = false;
        static int[] GeneratorStart = null;
        static IEnumerable<int[]> Generator() {
            var s = new Stack<int>();
            //... omitted code to initialize stack to GeneratorStart for brevity
            yield return s.ToArray();
            while (!Done) {
                Increment(s);
                yield return s.Reverse().ToArray();
            }
        }

        static int Base = 25600; //example number (none of this is important
        static void Increment(Stack<int> stack) { //outside the fact 
            if (stack.Count == 0) {               //that it is generating an array
                stack.Push(1);                    //of a large base
                return;                           //behaving like an integer
            }                                     //with each digit stored in an
            int i = stack.Pop();                  //array position)
            i++;
            if (i < Base) {
                stack.Push(i);
                return;
            }
            Increment(stack);
            stack.Push(0);
        }
    }
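To make the "large-base integer" behaviour concrete, here is a minimal sketch that reuses the same `Increment` logic with a deliberately tiny base so the carry is visible. The class name `CounterDemo`, the method `FirstValues`, and the base of 3 are assumptions made for illustration only; they are not part of the original program.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CounterDemo {
    // Hypothetical small base so the carry is easy to follow; the question uses 25600.
    const int Base = 3;

    // Same logic as Increment in the question: the top of the stack is the
    // least significant digit.
    static void Increment(Stack<int> stack) {
        if (stack.Count == 0) { stack.Push(1); return; }
        int i = stack.Pop() + 1;
        if (i < Base) { stack.Push(i); return; }
        Increment(stack);   // carry into the more significant digits
        stack.Push(0);      // the overflowing digit wraps around to 0
    }

    // Returns the first `count` counter values, most significant digit first.
    public static string[] FirstValues(int count) {
        var s = new Stack<int>();
        var result = new List<string>();
        for (int n = 0; n < count; n++) {
            Increment(s);
            // Stack<T> enumerates top-first, so Reverse() yields bottom-first,
            // i.e. the most significant digit comes first.
            result.Add(string.Join(",", s.Reverse()));
        }
        return result.ToArray();
    }
}
```

With a base of 3, `CounterDemo.FirstValues(4)` yields `1`, `2`, `1,0`, `1,1`: the third increment overflows the single digit and carries into a new, more significant position.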

I've come up with this:

    // requires: using System.Linq; using System.Threading.Tasks;
    var tasks = new Queue<Pair<int[], Task>>();
    foreach (var temp in Generator()) {
        var arr = temp;
        tasks.Enqueue(new Pair<int[], Task>(arr, Task.Run(() => {
            //... use arr as needed
        })));
        // tasks that are still running
        var tArray = tasks.Select(v => v.B).Where(t => !t.IsCompleted).ToArray();
        if (tArray.Length > 7) {
            Task.WaitAny(tArray);
            // advance the restart point past the completed head of the queue
            var first = tasks.Peek();
            while (first != null && first.B.IsCompleted) {
                Storage.UpdateStart(first.A);
                tasks.Dequeue();
                first = tasks.Count == 0 ? null : tasks.Peek();
            }
        }
    }

    //...
    class Pair<TA,TB> {
        public TA A { get; set; }
        public TB B { get; set; }
        public Pair(TA a, TB b) { A = a; B = b; }
    }
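For anyone who wants to try the idea in isolation, the following is a rough, self-contained sketch of the same in-order checkpoint queue. It is not the exact code above: the names `CheckpointedRunner`, `Run`, `work`, `checkpoint`, and `maxInFlight` are hypothetical, `KeyValuePair` stands in for `Pair`, and `checkpoint` plays the role of `Storage.UpdateStart`. Two things are added as assumptions: the queue is drained once the generator is exhausted, and each task is waited on before its input is checkpointed so that a faulted task throws instead of being recorded as done.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

static class CheckpointedRunner {
    // Runs work(item) for each item with at most maxInFlight unfinished tasks,
    // and calls checkpoint(item) strictly in generation order, only after
    // every earlier item has finished as well.
    public static void Run(IEnumerable<int[]> items, Action<int[]> work,
                           Action<int[]> checkpoint, int maxInFlight) {
        var pending = new Queue<KeyValuePair<int[], Task>>();
        foreach (var item in items) {
            var arr = item; // fresh local per iteration for the closure
            pending.Enqueue(new KeyValuePair<int[], Task>(arr, Task.Run(() => work(arr))));

            var running = pending.Select(p => p.Value).Where(t => !t.IsCompleted).ToArray();
            if (running.Length >= maxInFlight) {
                Task.WaitAny(running); // block until at least one task finishes
            }
            // Advance the checkpoint past the finished prefix of the queue.
            while (pending.Count > 0 && pending.Peek().Value.IsCompleted) {
                var done = pending.Dequeue();
                done.Value.Wait(); // rethrows if the task faulted
                checkpoint(done.Key);
            }
        }
        // Generator exhausted: drain what is left, still in order.
        while (pending.Count > 0) {
            var last = pending.Dequeue();
            last.Value.Wait();
            checkpoint(last.Key);
        }
    }
}
```

A call might look like `CheckpointedRunner.Run(Generator(), arr => { /* use arr */ }, Storage.UpdateStart, 8);`. Because the checkpoint only ever moves past the completed head of the queue, a restart may repeat some tasks that finished out of order, which the question says is acceptable. One caveat of this design is that a single slow task at the head holds the checkpoint back while completed entries accumulate behind it.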
