Priority queue: parallel processing

Question

I am trying to solve an algorithmic problem about priority queues on Coursera, but the grader on their website keeps failing my program with a time limit exceeded error. Yet when I run it on my PC with a huge input (5000 threads, 100000 jobs), it runs smoothly and prints the correct result in no more than 1 second.

This is the problem description:

(image of the problem statement, not reproduced here)

This is the link to my code on GitHub: https://gist.github.com/giantonia/3ddbacddc7bd58b220ab592f802d9602

Any help is appreciated!

Answer 1

The weakest point of your code is this loop:

while len(jobs) > 0:
    if threads[0][1] <= time:
        ...
    else:
        time += 1

This loop runs once per unit of time, not once per job that has to be done. It costs O(MAX_T), where MAX_T is the largest time value the clock reaches. Too slow!
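
To make the difference concrete, here is a minimal sketch. It is not the code from the question (that is only linked, not quoted in full); `simulate` and its `jump` flag are hypothetical names made up for illustration, and the linear scan for the earliest free thread is there only to keep the example short.

```python
def simulate(n_threads, jobs, jump):
    """Assign jobs to threads and count loop iterations.

    jump=False advances the clock one unit at a time (the slow pattern);
    jump=True moves the clock straight to the next moment a thread is free.
    Hypothetical helper, for illustration only.
    """
    free_at = [0] * n_threads  # time at which each thread becomes free
    time = 0
    steps = 0                  # number of loop iterations performed
    starts = []                # (thread index, start time) per job
    i = 0
    while i < len(jobs):
        steps += 1
        # Earliest free thread, ties broken by the smaller index
        t = min(range(n_threads), key=lambda k: (free_at[k], k))
        if free_at[t] <= time:
            starts.append((t, time))
            free_at[t] = time + jobs[i]
            i += 1
        elif jump:
            time = free_at[t]  # skip the idle time in one step
        else:
            time += 1          # one iteration per unit of time
    return starts, steps
```

With one thread and three jobs of duration 1000, both variants return the same start times, but the tick-by-tick loop needs 2003 iterations and the jumping loop needs 5. The gap grows with the job durations, not with the number of jobs.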

This is my solution to the problem. It requires O(N + MlgN) for N threads and M jobs.

The idea is quite simple.

Construct a priority_queue of threads keyed by the earliest time to finish.
Pick next_thread from the priority_queue and update its time to finish the job.
Insert it back into the priority_queue.

Here is the code:

# python3

def less(alist, a, b):
    # Order by time to finish (index 1) first, then by thread order (index 0) on ties
    return (alist[a][1], alist[a][0]) < (alist[b][1], alist[b][0])

def swap(alist, key1, key2):
    alist[key1], alist[key2] = alist[key2], alist[key1]

def shift_up(alist, key):
    # Index 0 is a dummy, so the root is at index 1 and the parent of key is key // 2
    while key > 1:
        parent = key // 2
        if not less(alist, key, parent):
            break
        swap(alist, parent, key)
        key = parent

def shift_down(alist, key):
    size = len(alist)
    while True:
        smallest = key
        left = 2*key
        right = 2*key + 1
        if left < size and less(alist, left, smallest):
            smallest = left
        if right < size and less(alist, right, smallest):
            smallest = right
        if smallest == key:
            return
        swap(alist, key, smallest)
        key = smallest

def min_heap(alist):
    # Index 0 element is dummy. minimum element's index is 1
    # Move the minimum to the end and remove it there, so nothing has to be shifted
    swap(alist, 1, len(alist)-1)
    minimum = alist.pop()

    # Maintain heap structure in O(lgN) by sinking the new root
    shift_down(alist, 1)

    return minimum

def heap_insert(alist, value):
    alist.append(value)
    shift_up(alist, len(alist)-1)

line1 = input().split()
n = int(line1[0])
m = int(line1[1])
jobs = list(map(int, input().split()))
threads = []
for i in range(n):
    threads.append([i, 0])

# Insert dummy element to make heap calculation easier
threads.insert(0,[-1,-1])

record = []
# O(M) iterations; iterating avoids jobs.pop(0), which shifts the whole list each time
for duration in jobs:
    # "threads" is a min_heap ordered by the time to finish the allocated job.
    # 0 -> thread order, 1 -> time to finish the job
    next_thread = min_heap(threads)  # O(lgN)

    # Allocate the job to this thread and record the start time
    record.append([next_thread[0], next_thread[1]])

    # The popped thread stays busy until the next job is finished
    next_thread[1] += duration

    # Insert it back into min_heap
    heap_insert(threads, next_thread)  # O(lgN)

for thread_order, start_time in record:
    print(thread_order, start_time)
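
For comparison, the same three steps can be sketched with Python's standard `heapq` module instead of a hand-written heap. This is only a minimal sketch: the function name `assign_jobs` is made up for illustration, and it returns the result as a list instead of reading input and printing.

```python
import heapq

def assign_jobs(n_threads, jobs):
    """Return (thread index, start time) for each job, in input order.

    Illustrative sketch; assign_jobs is a hypothetical name.
    """
    # Entries are (time the thread becomes free, thread index). Tuples compare
    # element by element, so ties on time are broken by the smaller index.
    heap = [(0, i) for i in range(n_threads)]
    heapq.heapify(heap)  # O(N)

    result = []
    for duration in jobs:
        free_at, thread = heapq.heappop(heap)                # O(lgN)
        result.append((thread, free_at))
        heapq.heappush(heap, (free_at + duration, thread))   # O(lgN)
    return result
```

For example, `assign_jobs(2, [1, 2, 3, 4, 5])` returns `[(0, 0), (1, 0), (0, 1), (1, 2), (0, 4)]`. Storing the time first in each tuple is what lets the default tuple comparison do the tie-breaking without a custom comparison function.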

Answer 2

First, I'd recommend running the solution locally on the maximum test (that is, n = 100000 and m = 100000). Yes, 5000 and 100000 is a big test, but why stop there? Why not use the largest possible test case?
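
A minimal sketch of how such a test could be generated. The input layout (n and m on the first line, job durations on the second) is inferred from how the code in this thread reads its input; the name `make_max_test`, the 10^9 upper bound on durations and the lower bound of 1 are assumptions, so check them against the actual problem statement.

```python
import random

def make_max_test(n=100_000, m=100_000, max_duration=10**9, seed=0):
    """Build the text of a worst-case-sized input.

    Hypothetical helper; the duration bounds are assumptions, not taken
    from the problem statement.
    """
    rng = random.Random(seed)  # fixed seed so the test is reproducible
    durations = [rng.randint(1, max_duration) for _ in range(m)]
    return f"{n} {m}\n" + " ".join(map(str, durations)) + "\n"
```

Writing the returned string to a file and feeding that file to the solution on standard input, while timing the run, shows whether the solution survives the largest case.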

Second, there are at least two flaws in your solution:

It increases the time by one instead of jumping to the next event:
```
while len(jobs) > 0:
    if threads[0][1] <= time:
        record.append([threads[0][0], time])
        ...
    else:
        time += 1
```
It requires O(MAX_T) operations. That's too much if the maximum time is 10^9.
jobs.pop(0) can take O(n) time. It depends on the Python implementation, but if the list is stored like a C++ vector, which is the case for many interpreters, every remaining element has to be shifted. That yields O(n^2) operations in the worst case, which is too much as well.
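
One way around the second flaw, as a small sketch: `collections.deque` from the standard library removes from the front in O(1). The function name `consume_in_order` is made up for illustration. When the jobs are only read once in order, a plain `for job in jobs:` loop avoids the problem just as well.

```python
from collections import deque

def consume_in_order(jobs):
    """Take jobs from the front one by one without shifting the rest.

    Hypothetical helper, for illustration only.
    """
    pending = deque(jobs)
    handled = []
    while pending:
        handled.append(pending.popleft())  # O(1), unlike list.pop(0)
    return handled
```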

There might be other slow parts in your solution (I spotted these two immediately, so I wrote only about them).

I'd recommend that you redesign the algorithm, prove that it's fast enough (hint: it should be something like O((n + m) log n)), and only then implement it.
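
A rough back-of-the-envelope comparison for the sizes mentioned in this answer. The numbers count abstract steps and ignore constant factors, so treat them only as orders of magnitude.

```python
import math

n = m = 100_000    # maximum sizes mentioned in this answer
max_time = 10**9   # maximum time mentioned in this answer

heap_based = (n + m) * math.log2(n)  # about 3.3 million steps
tick_based = max_time                # one loop iteration per unit of time
pop_front = m * m                    # worst case when every pop(0) shifts the list

print(f"heap-based:   {heap_based:.2e}")
print(f"tick-by-tick: {tick_based:.2e}")
print(f"pop(0) total: {pop_front:.2e}")
```

The heap-based bound is hundreds of times smaller than the tick-by-tick loop and thousands of times smaller than the quadratic pop(0) cost.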

Answer 1
1 2016-12-16 18:18:45

Answer 2
0 2016-12-16 16:57:44
