
Python PriorityQueue items of same priority got in random order

I used the queue.Queue class for passing tasks from one thread to another. Later I needed to add priority, so I changed it to PriorityQueue, using the proposed PrioritizedItem wrapper (because the tasks are dicts and cannot be compared). Then, in rare situations, it started causing task mix-ups. It took me a while to realise/debug that same-priority items in a PriorityQueue do not keep their insertion order, or, even worse from a debugging point of view, usually they do.

I guess FIFO is the default assumption when talking about task queues; this is presumably why Queue is not called FifoQueue. So PriorityQueue should explicitly state that it is not FIFO for equal-priority items. Unfortunately, the Python docs do not warn about this, and that missing warning caused a headache for me, and probably for others too.
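To make the pitfall concrete, here is a minimal sketch (the Item class and its tags are made up for illustration) showing that heapq, which backs PriorityQueue, does not pop equal-priority items in insertion order:

```python
import dataclasses
import heapq

@dataclasses.dataclass(order=True)
class Item:
    priority: int
    tag: str = dataclasses.field(compare=False)  # payload, excluded from comparison

heap = []
for tag in "abcd":                       # insert four items of equal priority
    heapq.heappush(heap, Item(5, tag))

popped = [heapq.heappop(heap).tag for _ in range(4)]
print(popped)  # not necessarily ['a', 'b', 'c', 'd']
```

Because all four items compare equal, the pop order is decided by the internal heap layout, not by when an item was put in.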

I have not found any ready-made solution, but I am pretty sure others may need a PriorityQueue that keeps the insertion order of equal-priority items. Hence this ticket...

While I hope the Python docs will add this warning in some future release, let me share how I solved the problem.

heapq (used by PriorityQueue) proposes inserting a sequence number into the compared part of the item, so that the effective priority is unambiguous and no two items ever compare equal.
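That recommendation from the heapq documentation can be sketched like this (the put helper is my own name for illustration):

```python
import heapq
import itertools

heap = []
counter = itertools.count()  # monotonically increasing tie-breaker

def put(priority, task):
    # the middle tuple element breaks ties, so "task" itself is never compared
    heapq.heappush(heap, (priority, next(counter), task))

put(10, dict(b=2))
put(0, dict(a=1))
put(10, dict(c=3))

tasks = [heapq.heappop(heap)[2] for _ in range(3)]
print(tasks)  # [{'a': 1}, {'b': 2}, {'c': 3}]
```

The lowest priority wins, and among equal priorities the lower sequence number (earlier put) wins.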

I also added a threading.Lock so that two items cannot end up with the same sequence number due to a thread race.

import itertools
import threading

class _ThreadSafeCounter:
    """Thread-safe callable that returns ever-increasing integers."""
    def __init__(self, start=0):
        self.countergen = itertools.count(start)
        self.lock = threading.Lock()
    def __call__(self):
        with self.lock:
            return next(self.countergen)

#create a function that provides incremental sequence numbers
_getnextseqnum = _ThreadSafeCounter()

import dataclasses
import typing

@dataclasses.dataclass(order=True)
class PriorityQueueItem:
    """Container for priority queue items

    The payload of the item is stored in the optional "data" field (None by
    default) and can be of any type, even one that cannot be compared, e.g. dict.

    The queue priority is defined mainly by the optional "priority" argument (10
    by default). If several items have the same "priority", their put-order is
    preserved thanks to the automatically increasing sequence number, "_seqnum".
    Usage in the producer:
        pq.put(PriorityQueueItem("Best-effort-task", 100))
        pq.put(PriorityQueueItem(dict(b=2)))
        pq.put(PriorityQueueItem(priority=0))
        pq.put(PriorityQueueItem(dict(a=1)))
    The consumer gets the tasks with pq.get().getdata(), and will actually receive
        None
        {'b': 2}
        {'a': 1}
        "Best-effort-task"
    """
    data: typing.Any = dataclasses.field(default=None, compare=False)
    priority: int = 10
    _seqnum: int = dataclasses.field(default_factory=_getnextseqnum, init=False)

    def getdata(self):
        """Get the payload of the item in the consumer thread"""
        return self.data
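Putting it together, a minimal end-to-end check of the producer example from the docstring (single-threaded here for brevity; the class definitions are repeated so the snippet runs on its own):

```python
import dataclasses
import itertools
import queue
import threading
import typing

class _ThreadSafeCounter:
    """Thread-safe callable that returns ever-increasing integers."""
    def __init__(self, start=0):
        self.countergen = itertools.count(start)
        self.lock = threading.Lock()
    def __call__(self):
        with self.lock:
            return next(self.countergen)

_getnextseqnum = _ThreadSafeCounter()

@dataclasses.dataclass(order=True)
class PriorityQueueItem:
    data: typing.Any = dataclasses.field(default=None, compare=False)
    priority: int = 10
    _seqnum: int = dataclasses.field(default_factory=_getnextseqnum, init=False)
    def getdata(self):
        return self.data

pq = queue.PriorityQueue()
pq.put(PriorityQueueItem("Best-effort-task", 100))
pq.put(PriorityQueueItem(dict(b=2)))
pq.put(PriorityQueueItem(priority=0))
pq.put(PriorityQueueItem(dict(a=1)))

received = [pq.get().getdata() for _ in range(4)]
print(received)  # [None, {'b': 2}, {'a': 1}, 'Best-effort-task']
```

The priority-0 item comes out first, the two priority-10 items come out in put-order, and the priority-100 item comes out last, exactly as the docstring promises.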
